ELINOR: The Electronic Library project at
De Montfort University Milton Keynes 


Kathryn Arnold, 


Information Centre Manager, De Montfort University Milton Keynes 


Mel Collier, 


Head of Information Centre, De Montfort University 


Anne Ramsden, 


Information Officer, De Montfort University Milton Keynes 


De Montfort University Milton Keynes 

In November 1990 Leicester Polytechnic (now De 
Montfort University) won a contract to provide higher 
education in the new city of Milton Keynes, against 
strong competition from a number of other higher 
education institutions in the region. In January 1991 
the Polytechnic took possession of a green field site 
for the new campus from the Milton Keynes 
Development Corporation. Eight months later in 
September 1991 the first buildings had been erected, 
staff appointed, services organized, students recruited 
and the first courses commenced. This must surely 
claim to be the quickest ever inception of a new higher
education institution. 


Higher education context 

The speed of this development is a potent symbol of 
the fundamental changes taking place in the higher 
education structure of the UK. De Montfort University 
is playing its full part in meeting the government’s 
objectives of doubling the age participation index and 
an overall increase of 50% in student numbers over the 
next decade. 

From the outset, however, it was clear that this new 
development would be taking place in a fundamentally 
different financial climate to that of the new universities 
of the sixties, and the polytechnics of the seventies. 
There would be extremely limited capital funds, no 
generous cushions of recurrent grant and very limited 
time. 

De Montfort University Milton Keynes is therefore being
developed as part of the De Montfort University
strategic plan to establish a multi-campus networked
institution with major locations in the east and central
heartlands of England. It is being developed according
to a business plan based on income generated from
student growth, which requires extremely careful
attention to margins and efficiency.


The library context 

The present state of academic libraries is that they are 
highly automated book collections. Some information 
is provided in electronic form through online services, 
local databases and CD-ROM. Huge resources are
expended each year in maintaining and managing
book collections.

It is now a truism to state that the technology is 
available to make the long-heralded transition to the 
electronic library, but that there are many practical and 
theoretical factors which are inhibiting the transition. 
Much innovative work is being done in individual 
projects and libraries around the world, but few, if any,
academic libraries have yet set themselves the goal of 
transforming their resources into electronic form. At 
Milton Keynes, in setting up entirely new academic
information services, we have the opportunity,
the educational environment and the economic 
imperative to confront this challenge. 


Concept 
The Electronic Library is a teaching, learning and 
study environment for higher education in which 
information is held primarily in electronic form. It is 
not restricted to a physical locale. Users may access it 
from anywhere and it will give access to information 
held in many places. It will contain text, still and 
moving images, and sound. It will be intimately linked 
with the publishing and bookselling industry. 

This statement articulates the belief that conventional
library services can only provide a partial answer to
the needs of higher education in the future. It states
clearly that there will be a fundamental shift in the 
balance of resources from print sources to electronically
sourced information. It implies that the proportion of
resources expended on acquiring and maintaining 
collections will steadily reduce. 

The concept is in line with the need to effect 
substantial growth in higher education whilst making 
efficiency gains. Electronic information is inherently 
more cost effective to manage in staffing and space 
and offers almost unlimited scope for managing the 
information explosion. It also offers the only feasible 
answer to the current drive towards open, distance, 
independent and resource based learning. 

The statement recognizes the crucial role of the 
publishing and bookselling industry. Without their 
co-operation on the supply side the electronic library 
cannot develop. At the same time the publishers know 




Aslib Proceedings, vol. 45, no. 1, January 1993. pp. 3-6 







that the industry must restructure towards electronic 
sourcing of academic information in order to remain 
competitive. The publishers recognize that change is 
inevitable but will wish to control change in order to 
protect their business base. 


Aim and objectives of the Electronic Library 
Project 

Aim The project aim is to develop a fully electronic 
library environment within five years. Within this 
period, if not before, the information required by 
students and staff will be delivered primarily in 
electronic form or by electronic communication 
systems. Where there is an overriding practical or 
educational reason for preferring documents there 
will, of course, be exceptions but even then the 
documents will be sourced in electronic form in the 
system. 

The objectives of the project are as follows: 


1. To develop/procure/design appropriate
workstations, networks, storage and retrieval
systems and software.

2. To make agreements with copyright owners. 

3. To develop systems for import of information 
from publishers. 

4. To develop monitoring, and, as appropriate, 
charging mechanisms. 

5. To research user needs, satisfaction and 
outcomes. 

6. To design courses and materials around this 
concept. 

7. To research the educational implications. 


Benefits 
The benefits expected of developing the electronic 
library infrastructure are important and far reaching: 


Educational The trend in higher education is towards 
student centred learning with emphasis on resource- 
based investigative learning. Conventional library 
support will be limiting and inflexible. The Electronic 
Library will help students and staff produce higher 
quality work which is better informed, better researched 
and more up to date. 


Informational The information explosion renders
conventional libraries totally unable to manage the 
volume of information which must be harnessed and 
made available to higher education. Only electronic 
storage, transmission and retrieval can cope with this 
challenge. 


Pedagogic The teaching process will increasingly 
embrace electronic techniques such as multimedia 
presentations and video conferencing. The Electronic 
Library will integrate into this environment very 
readily. 

Economic Higher education in the UK is very
competitive and will become even more so over the
next decade. Conventional library resources are staff
and space intensive, conventional teaching techniques
even more so. The Electronic Library will support the 
drive for competitiveness in delivery of higher 
education whilst improving quality of and access to 
information services. In particular the current plans 
for student growth will place existing library buildings 
under considerable pressure. The Electronic Library 
concept must be explored as an alternative to wholesale 
building development throughout the sector. 

It is envisaged that the project will provide insights 
and outcomes across a wide range of research areas 
including:

1. Image processing and retrieval
2. Multimedia education
3. Text retrieval and indexing
4. Human/machine interface
5. Data exchange and standards
6. Pedagogic development
7. Library science
8. Higher education economics
9. Electronic publishing


Project funding and related work 

The project has been adopted by De Montfort 
University as a major component in its strategy to 
create a networked multi-campus institution and is 
therefore receiving substantial institutional funding. 
Additionally the project is supported by the IBM UK 
Scientific Centre in the form of expertise, equipment 
and the funding of a researcher for three years. The 
British Library Research and Development Department 
is also supporting the project in the form of funding for
a researcher for two years. 

The project has been code-named ELINOR and is 
one of the family of Electronic Library (EL) projects 
under development at De Montfort University. Sister 
projects are ELISE, an international research 
programme investigating full colour image banks for 
libraries; and ELECTRA, a planned programme to 
develop an integrated approach to delivery of media 
based learning in the institution. 


Staffing 

We have been fortunate to bring together a multi- 
disciplinary team of researchers to work on this project. 
Anne Ramsden is project co-ordinator and has a 
strong background in library systems and text retrieval.
Dianguo Zhao has a background in object orientated 
graphics and simulation and previous knowledge of 
library systems. Zimin Wu has a doctorate in library 
science, specializing in automatic indexing for text
retrieval. In addition we are able to call on advice and 
assistance from IBM, publishers, and scientists and 
engineers from De Montfort University. 


Work so far 

Work officially started on 1 March 1992 and
progress to date can be divided into four main 
categories. 

1. Defining the requirements for the Electronic
Library
In summary the overall requirement is to design and 
implement a system which will receive, store and 
retrieve material in electronic form. Material includes 
books, journals, abstracts, course notes, lectures, 
overheads, photographs and diagrams. 

Features are to be included in the electronic library 
which will: 


1. Accommodate material in character encoded or
image format,
2. Allow copyright control and monitoring,
3. Allow charging by usage,
4. Be manageable by non-specialists,
5. Provide sophisticated text retrieval,
6. Provide OCR,
7. Provide a range of printing options,
8. Provide a range of security features,
9. Provide display facilities that are friendly to
users and resemble a reading environment,
10. Provide acceptable image display resolution,
11. Provide facilities to enhance images,
12. Integrate with other facilities, e.g. OPAC,
e-mail, CD-ROM network.


2. Defining a pilot project 

The aim of the pilot project is to implement and assess
the suitability of the latest document image processing
technology for creating a collection of electronic texts
for this library. Rather than try to meet the information
needs of the polytechnic at once, we have opted to
select one course to pilot on the system. The BA/BSc
Business Information Systems course was chosen 
because it is self-contained and it is varied in content 
including a European language component. The same 
course is also run at De Montfort University Leicester, 
so there will be opportunity to compare usage of 
information sources by conventional and electronic 
means at both sites. The pilot project will not only test 
the Electronic Library concept in a working situation 
but also familiarize the Information Centre with selecting,
implementing and working with a document image 
processing system. 

The pilot project is for two years during which time 
it is planned to gain real experience of working in a 
fully electronic library environment. It is estimated 
that over the three years of the piloted course the 
document volume to be inducted into the system will 
be around 80,000 pages, and that each student will 
access the system on average for about half an hour per day.
In all three years of the course there will be 130 
students. 
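
As a rough illustration of the load these figures imply, the arithmetic can be sketched in Python. The page count, course length, student numbers and daily usage are taken from the text; the length of the opening day passed to average_busy_stations is purely a hypothetical parameter, since the paper does not state it.

```python
# Figures stated in the text.
TOTAL_PAGES = 80_000              # pages inducted over the three years of the course
COURSE_YEARS = 3
STUDENTS = 130                    # across all three years of the course
HOURS_PER_STUDENT_PER_DAY = 0.5   # "about half an hour per day"


def pages_per_year(total_pages: int = TOTAL_PAGES, years: int = COURSE_YEARS) -> float:
    """Average number of pages to be inducted per year of the course."""
    return total_pages / years


def student_hours_per_day(students: int = STUDENTS,
                          hours_each: float = HOURS_PER_STUDENT_PER_DAY) -> float:
    """Total time spent on the system per day by all students together."""
    return students * hours_each


def average_busy_stations(opening_hours: float) -> float:
    """Average number of enquiry stations in use at any one time.

    opening_hours is an ASSUMPTION supplied by the caller; the text does
    not say how long the system is available each day.
    """
    return student_hours_per_day() / opening_hours


if __name__ == "__main__":
    print(f"Pages per year: {pages_per_year():.0f}")
    print(f"Student-hours per day: {student_hours_per_day():.1f}")
    # Hypothetical 10-hour opening day, for illustration only.
    print(f"Average stations busy: {average_busy_stations(10):.1f}")
```

An average of this kind says nothing about peak demand, which would need observed usage data from the pilot itself.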

The wide range of documents described above will 
come in a variety of formats including variable page 
size, variable column arrangement, complex
documents with inset pictures and of varying quality 
including print, typed, handwritten, dot matrix print 
and so on. 

A detailed checklist for evaluating the pilot has been
drawn up, summarized under the following categories: 


Document preparation
Input
OCR/ICR
Image handling
Quality control
Storage
Compression
Image retrieval
Indexing
Text retrieval
System architecture and performance
System management


3. Specification and procurement of pilot system 
The next part of the work was to define the required 
hardware and software configuration and survey the 
market. The choice of system is, to a certain extent, 
influenced by equipment already in place, consisting of
multiple PCs (386, 4 MB RAM, 40 MB HD) for user enquiry
stations and an IBM RISC 6000 for the image server. The
PCs are required to work under Windows 3 and the
RISC machine under AIX 3.

A detailed checklist was drawn up to cover the 
supply of a Document Image Processing system, 
complete with OCR/ICR, text retrieval and image 
enhancement capability. Equipment includes multiple 
optical disc management, scanner, high speed laser 
printer, high resolution colour monitors, and image 
processing and compression cards. It is not possible to
give detail here of the whole requirement, but we 
would draw particular attention to the need for a 
sophisticated text retrieval capability enhanced by 
flexible printing options, holding of documents in 
folders, and hypertext browsing. It is especially
important that the system display should look as 
nearly as possible like a book, allowing the reader to 
place bookmarks, turn pages, navigate from contents/ 
index to pages, manipulate pages and write in the 
margins. Clearly scanning and data entry facilities 
need to offer a large degree of sophistication to support 
this type of display and retrieval. 


4. Negotiations with publishers

It was recognized from the start that co-operation from
publishers would be vital to the success of the pilot. It
was envisaged that publishers would be very cautious 
about such a development. At the same time it was 
hoped that individual publishers would be sufficiently 
intrigued by the project to take part in its learning 
process provided that adequate protection on copyright 
is given. Happily this has proved to be the case and, to 
date, a dozen publishers have given permission to scan 
over fifty works which are prescribed reading for the 
Business Information Systems course. All the 
publishers involved have expressed keenness to gain 
intelligence on usage and methodologies from the
project.


Conclusion

There has been much speculation and discussion about 
the electronic library in the academic environment, 
but relatively little evidence of the subject being 
addressed in a structured way with specific goals and
timescales. This project aims to fill that gap both in the
spirit of exploration and innovation and in the context 
of practical necessity. This paper has given a broad 
overview of the project plans which will be elaborated 
by detailed reports and research papers in due course. 


This paper was presented at the Fifth Dawson's Research Seminar
held at Loughborough University on 22 September 1992. 


Electronic journals — past, present ... and future?


Dr Cliff McKnight 
HUSAT Research Institute, Loughborough University 


Introduction

Many people have suggested that the electronic journal 
would solve several of the problems experienced with 
paper journals. However, although technology has 
developed apace, the commercial electronic journal 
seems to be slow to develop. It is now over ten years 
since the first electronic journal experiments began, so 
perhaps it is time to take stock of what we have learned 
so far. The present paper will therefore begin by 
looking at some of the projects which have finished, 
mention some current projects and then point to some 
remaining problems and issues which must be 
addressed if the electronic journal is ever to become a 
regular feature in the scholarly communication process. 


The past 


BLEND 

The BLEND project (Shackel, 1982; 1991) aimed not 
only to investigate the feasibility of an electronic 
journal but also the feasibility of supporting the entire
communication process (from authoring and
submitting, through refereeing and editing, to
publishing) via computer. To this end, a central
mainframe was used and the various participants in 
the process communicated through this machine, with 
the resulting issues of the journal Computer Human 
Factors being stored on it. Users accessed the system 
via a remote terminal either over the Joint Academic 
Network (JANet) or the Public Switched Telephone 
Network (PSTN). 

In at least one respect, Computer Human Factors 
proved superior to a paper journal. Although each 
article was ‘read-only’ once issued, there was space 
allocated for comments to be entered on each article 
and these comments could then be seen by subsequent 
readers of the article. The fact that the articles’ authors 
were also part of the 'electronic community' meant 
that they too could read — and respond to — the 
comments. The resulting dialogue created much more 
of a feeling of ‘live’ research than is possible in the 
paper medium where the time between submission
and publication is often over a year. 

While successful in this respect, Computer Human 
Factors was not without its problems. For example, it
suffered from the technology of the day: poor screens,
low transmission rates and so forth. Movement through
the articles was slow and so, not surprisingly, users 
preferred to print copies of articles that they wanted to 
read in their entirety. The articles themselves were 
restricted to plain ASCII text and ‘typewriter graphics’.
Also, at the time of the BLEND project terminals were
not so readily available. Typically, terminals would be
located in the university’s computer centre (which
could well be in a different building to the academic’s 
office) and would frequently be in use by other users. 
This is in contrast to the present position where most 
academics have some sort of computer on their desk 
(Shackel, 1990) and in many cases it is connected into 
the campus Local Area Network. 


QUARTET 

Project QUARTET aimed to investigate the 
implications of information technology for the 
scholarly communication process. It was therefore 
somewhat wider than BLEND, being concerned with 
a broad spectrum of communication activities including
electronic mail, computer based conferencing, 
electronic document delivery, desktop publishing and 
electronic publishing (Tuck et al., 1990). As part of 
Project QUARTET, colleagues and I designed and 
built the world's first hypertext electronic journal, 
HyperBIT. This was intended not as a replacement
for the library archive copy of the journal but rather as
an electronic version of the personal subscription. Our
design was based on the results of various earlier 
studies by us of journal usage (e.g., Dillon, Richardson 
and McKnight, 1988) and as such specifically 
addressed the issue of user requirements. Hence, 
browsing through author/title lists at either the issue or 
volume level was supported, as was searching the 
entire contents of the journal. Each article was 
structured using the Guide™ hypertext system and 
cross-references in articles were made into active 
hypertext links, allowing the reader to move quickly 
and easily between articles. (A more complete 
description of the design is given in McKnight, Dillon 
and Richardson, 1991.) 

HyperBIT offered the user several advantages over 
the paper version. For example, it was always available
on the desktop (whereas the paper version might well
have been borrowed by a colleague). The entire contents
of the journal could be searched in order to locate, say, 
all articles which mentioned ‘screen’ or referred to 
work by ‘Eason’. The ability to move between related 
articles using the hypertext links was also 
advantageous, as was a pop-up window facility which 
provided instant access to the bibliographic details of 
references without leaving the text.

The chief disadvantage of HyperBIT compared to 
the paper version concerned graphics. Although the 
Macintosh system on which it was implemented 
allowed the display of quite sophisticated graphics, 
animation and sound, the screen resolution was far 
lower than the resolution available to produce the 
average paper journal. In many cases this is not a real
problem — the line art typical of diagrams and graphs 
presents no difficulty. However, the display of half- 
tone photographs, for example, is not really feasible 
on a standard Macintosh screen. 


ADONIS 

Before I am criticised for mis-representing the 
ADONIS project, let me say immediately that the
project itself was not concerned with the electronic 
journal. The principal aim of ADONIS was to utilise 
information technology in order to increase the 
profitability of document delivery without increasing 
the end user price (Campbell and Stern, 1987). To this
end, a workstation was assembled comprising a desktop
microcomputer with built-in CD-ROM drive, a high
resolution (300 dpi) A4 screen and a laser printer. A
total of 219 biomedical journals were digitally scanned
as they were issued and the resulting images stored on 
CD-ROM, yielding one new CD-ROM each week on 
average. The workstations were sited in test libraries 
such as the British Library Document Supply Centre 
at Boston Spa and the idea was that where possible a 
document request would be satisfied by retrieving the 
article from the CD-ROM and outputting it to the laser
printer. The normal alternative to this process involves 
locating the journal issue on the shelves and placing a 
marker there, taking the issue to the photocopier, 
copying the article and returning the issue to the shelf 
at the marked place. 

As part of Project QUARTET, colleagues and I 
undertook various investigations of the ADONIS 
system. For example, the feasibility of requesting a 
document via electronic mail and having it delivered 
over a high-speed telephone line to a local fax machine
was demonstrated. More interesting in the context of
the electronic journal were the studies we made of the 
system as a resource for direct use by academics. 
Ostensibly, such a system might be thought to offer the 
scholar a great deal and the reasons why it failed to do 
so provide valuable insights into the problems of 
electronic journals. 

Like HyperBIT, the ADONIS system suffered the 
problem of screen resolution. Although the system 
used a high resolution screen, biomedical journals 
make frequent use of photographic material which 
could not be displayed adequately. In addition, although
the system software allowed for searching on normal 
bibliographic details, the fact that the journals were 
stored as bit images meant that a full text search was 
not possible. Display of the pages was extremely slow 
and movement through an article was only possible 
one page at a time so users found it frustrating as a way 
of viewing journals. Also interesting was the finding
that 219 journals were simultaneously too many and
not enough. They were too many in the sense that no
single user was interested in more than a small
proportion of the 219. However, as a proportion of the
biomedical literature as a whole, 219 is only a small
part (Medline, for example, covers about 3200 journals),
so for most users the system did not contain their
favourite journals.


LISTSERV 

In recent years another model of the electronic journal 
has arisen based on the LISTSERV software. This 
name is an abbreviation of ‘list server’ which gives 
some insight into how the system works. In a typical 
system, a central computer holds a list of subscribers; 
when a new issue is available, the system sends
subscribers a ‘contents page’ and abstracts via email.
Subscribers can then request articles by sending an email
message to the server, with the articles being 
automatically delivered as email by the software. The 
Directory of Electronic Journals, Newsletters and 
Academic Discussion Lists (Okerson, 1991) lists 27 
such journals which are now distributed in this or a 
similar manner over the global academic network. 
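
The distribution model just described can be sketched as a small in-memory Python model. This is only an illustration of the flow (subscriber list, contents and abstracts sent to everyone, full articles sent on request); the class and method names are hypothetical and do not reflect the command syntax or interface of the real LISTSERV software, and an outbox list stands in for actual email delivery.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    article_id: str
    title: str
    abstract: str
    body: str


@dataclass
class ListServer:
    """Toy model of a list-server journal (hypothetical, not the real LISTSERV)."""
    subscribers: set[str] = field(default_factory=set)
    articles: dict[str, Article] = field(default_factory=dict)
    # Each entry is (recipient address, message text); stands in for email.
    outbox: list[tuple[str, str]] = field(default_factory=list)

    def subscribe(self, address: str) -> None:
        """Add an address to the central list of subscribers."""
        self.subscribers.add(address)

    def publish_issue(self, issue: list[Article]) -> None:
        """Store the articles and send only contents and abstracts to everyone."""
        for article in issue:
            self.articles[article.article_id] = article
        contents = "\n".join(
            f"{a.article_id}: {a.title}\n  {a.abstract}" for a in issue
        )
        for address in sorted(self.subscribers):
            self.outbox.append((address, contents))

    def request_article(self, address: str, article_id: str) -> None:
        """Send the full text of a single article to the requester."""
        article = self.articles[article_id]
        self.outbox.append((address, article.body))


if __name__ == "__main__":
    server = ListServer()
    server.subscribe("reader@example.org")
    server.publish_issue(
        [Article("a1", "A first paper", "A short abstract.", "Full text of the paper.")]
    )
    server.request_article("reader@example.org", "a1")
    for recipient, message in server.outbox:
        print(recipient, "<-", message.splitlines()[0])
```

In this sketch the full text of an article travels only when request_article is called, which is the sense in which the issue is unbundled.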

Although the concept of ‘issue’ is still used, the 
issue itself is effectively unbundled since subscribers 
can request single articles. However, the contents 
pages and abstracts can be stored for future reference 
and searching, and articles can be retrieved at any time
on demand. Such a system makes effective use of the 
network ‘bandwidth’ since only requested articles are 
transmitted. 

Unfortunately, in order to reach a wide audience, 
the LISTSERV journals are transmitted in a form 
which makes few assumptions about the type of 
computer which will be used by the recipient. In 
practice, this means that they are usually limited to 
plain ASCII text with fixed line lengths. Hence, the 
appeal to the lowest common denominator is still as 
real today as it was for the BLEND project. 


The Present 

I included the LISTSERV journals in the section on 
the past despite the fact that many such journals are 
currently operating. However, they offer a convenient 
link to the present. Looking at the ARL list gives the 
impression that the electronic journal is at least alive 
and well. Indeed it would appear to be continuing to 
develop and in this section therefore I will outline 
some of the projects already underway or about to 
start.


CORE 

The CORE (Chemistry Online Retrieval Experiment) 
project's aim is to deliver a large majority of the 
journal literature needed by one academic area in 
electronic form to workstations in a library and 
terminals on the desks of academics. Articles are held 
in both text and bit-map forms and a variety of interface
options are being investigated (Landauer et al., 1993). 
The CORE project represents a collaboration among 
five institutions: the Cornell University Albert Mann 
Library houses and administers the experiment; the 
American Chemical Society is providing ASCII and
microfilm versions of the last 10 years of 20 journals;
the Chemical Abstracts Service provides electronic
versions of their hierarchical indexing scheme tagged
to all of these articles; the Online Computer Library 
Center (OCLC) is contributing expertise in large 
database storage, access and search techniques; and 
Bellcore is contributing expertise on text and graphic
conversion and transmission as well as developing 
prototype user interfaces. 


TULIP 

The TULIP (The University Licensing Program) 
project, which was due to start by September 1992, 
also aims to make journals available over the network. 
Elsevier Science Publishers will make 42 of its 
materials science journals available to the 15 colleges 
and universities (including MIT, Harvard, Carnegie 
Mellon, Cornell and Princeton) that are participating 
in the experiment. The project will examine the 
economic, legal and technical issues involved in the 
electronic transmission of journals as well as 
considering user issues. In the first instance the journals 
will be stored as bit-maps (as was the case with 
ADONIS) which will severely restrict the ability to 
perform searches at the document level, although 
bibliographic searching will be possible. 


OJCCT 

The American Association for the Advancement of 
Science (AAAS) and the Online Computer Library 
Center (OCLC) have also launched an electronic 
journal, The Online Journal of Current Clinical Trials. 
However, there are very few articles on the system yet 
and growth does not seem to be as rapid as had been 
expected. In addition to the technical problems which 
initially delayed the project, Wilson (1992) reports 
that “the AAAS must persuade authors to submit high- 
quality papers in a new medium that may prove to be 
largely ethereal.” Although this problem will obviously 
decrease as the number of quality electronic journals 
increases, it seems that the situation is still not much 
different from that experienced by the EIES project 
mentioned earlier. 


IOPP/LUT

Lest it seems that currently only the USA is interested
in the electronic journal, I would like finally to mention 
a British project involving the Institute of Physics 
Publishing and Loughborough University with support 
from SCONUL. This project, funded by the British 
Library Research and Development Department (as 
were BLEND and QUARTET), will look at a variety 
of economic, technical and user issues involved in the 
electronic distribution of a journal. In the TULIP 
project, collaborating libraries will receive the 
electronic version of the journal free of charge if they 
already subscribe to the paper version. However, in 
the British project, both the paper and electronic
versions will be provided free of charge to collaborating 
libraries (including at least one industrial library). 


The future

The paper journal will clearly be with us for many 
years to come. While the developed nations may be 
pursuing various networking policies, the paper 
journals will at least continue serving those countries 
still lacking the basic levels of electronic information 
technology. However, it is also likely that the electronic 
journal is here to stay. What, then, stands in the way of 
the proliferation of electronic journals? 

There are not many technical problems remaining, 
although some of the desired technology is still too 
expensive for widespread use. For example, it is 
possible to display very high resolution colour graphics 
on a computer screen, but the cost of such screens and
the associated hardware is prohibitive at the present
time. Sending such images across a wide area network 
is currently possible but impractical. Also, a single 
graphics standard (the graphic equivalent of ASCII) 
would be useful so that fewer assumptions had to be 
made about the recipient machine. A portable electronic 
journal would also be desirable and may well be 
developed in the wake of the current interest in portable 
electronic books. 

With aspects such as graphics and portability, the 
basic technology exists already. In many cases, though, 
decisions need to be made about how to make sensible 
use of the technology. For example, journals like those 
in the BLEND project and on LISTSERV systems 
make use of a central host computer. However, it 
could be argued that this is inefficient because it uses 
the network bandwidth to deliver the same thing to 
many different places. Even with the planned upgrading 
of the networks (e.g., SuperJANet), it could be argued 
that the increasing demand will mean that even 
improving the networks will only allow us to maintain 
current services rather than improve them. 

If we want more than ASCII-only electronic journals,
then, perhaps they should be distributed as HyperBIT
was intended to be? This raises the question of where to
distribute the journal to. For example, HyperBIT was
conceived as being distributed direct to the end user, but
the TULIP and IOPP journals will be distributed to 
libraries, thereby maintaining the library role in the 
communication chain. 

In addition to such issues, there remain problems
associated with concepts such as copyright and payment 
structures. The question of copyright control is of 
particular concern for publishers. Although the paper 
medium is relatively easy to copy using a photocopier, 
the resulting copy is of inferior quality to the original. 
In the electronic domain, copying is not only easy and 
fast but also the resulting copy is identical to the 
original. If I receive an article over the network, it 
takes me no more than a few key-presses to forward a 
copy of the entire article to someone else. This means 
that either methods of electronic copy protection must 
be developed or the concept of copyright must be 
reconsidered. 


Indeed, many authors are questioning the need to 
assign copyright to publishers now that they have an 
apparently free distribution network. In the paper 
domain, if I order a journal I either have to pay for it 
myself (a personal subscription) or I must have 
agreement that the department or university library 
will pay. If I receive a LISTSERV journal, however, 
it is not clear who pays. Certainly there are costs 
involved, but they are costs which are largely 
transparent to the user. (The storage costs are met by 
the host institution, usually a university, and in this
respect we may be witnessing a return to the situation 
in which universities were also publishing houses.) 
My access to the network is paid for as part of the 
general funding for computing within the university 
and I don’t receive a bill. Hence, it may well prove 
necessary to develop new costing models for the 
production and distribution of journals in the electronic 
domain. 

It is clear, then, that it is not a technological barrier 
which stands in the way of the success of electronic 
journals. However, even if all the aforementioned 
problems are solved, the success of the electronic 
journal will still depend on the user. Simply being 
electronic is not enough — users will only adopt 
electronic journals if they offer at least as much as the 
paper version, preferably more, including offering 
such important quality control mechanisms as the peer 
review or refereeing system. It is usability, not 
technology, which will determine the success or failure 
of the electronic journal. It only has a future if we get 
this right! 
Notes


1 The BLEND project was keen to stress that 
articles were ‘archived’ rather than ‘published’ 
because the earlier EIES project in America had 
stumbled over just this point. For many 
academics, publishing is an important part of 
career development and the American project 
had found that academics were unwilling to 
submit papers to an experimental journal. 

2 The journal used was Behaviour and Information
Technology (BIT), published by Taylor and
Francis, whom we gratefully acknowledge for
allowing its use.

3 This facility was provided on the basis of
observations of many users who would keep a
finger permanently in the References section of
the article when using the paper version, turning
to the section when they encountered a reference
in the text and then returning to the text. In this
sense it provided what Brian Shackel termed an
‘electronic finger’.

4 Typical screen resolution is 72 dots per inch 
(dpi) whereas a typical typesetting machine has 
a resolution of 1200 dpi — roughly 16 times 
better! 

5 The term ‘bandwidth’ refers to the amount of 
information per unit time which can be 
transmitted over the network. In hydraulic terms, 
it is analogous to the diameter of the water pipe. 
Abstract


It is argued that the current generation of online catalogues does not meet basic user expectations about how to search
for information. After a brief examination of virtual reality and its associated technology, a new form of online catalogue,
the virtual reality library, is proposed. Users browse an information space, a computer-controlled set of shelf orderings
for items. Its form, workings and design are investigated in detail. The concept of the virtual reality library is then
applied to information resources which either have no physical repository or have one which is not accessible to
users.


Introduction 

Libraries provide access to information by a 
combination of making their collection available on 
open shelves and creating an enriched catalogue of 
their collection which can be consulted. This latter 
method is computerized in most libraries. These two 
access methods have different strengths and 
weaknesses. A summary of the characteristics of each 
method follows:— 


Online catalogue    Library shelves
queried             browsed
complex             simple
diverse             homogenous
discrete            continuous
drab                eye-catching


Online catalogues expect the user to construct some 
form of query or statement of user’s information need. 
Why should users have to query to search? According 
to my dog-eared Oxford Concise English Dictionary 
the word ‘search’ means: 


‘examine thoroughly (place, person, etc) for what
may be found or to find something of which presence
is known or suspected’


This typical definition of searching makes no 
mention of querying, ie thinking of terms to describe
the object of search, in this case some item or items in 
a library's collection. Why have ‘search’ and ‘query’ 
become almost synonyms in this situation? When a 
person goes to a bookshop they very rarely consult any
catalogue of books for sale (like Books in Print on 
microfiche). Rather they browse the shelves, content 
to be guided to what they want by author arrangements 
or general subject groupings, signed by shelf labels. 

Browsing can be generally defined as wandering 
around a physical location seeking some item or items 
that may be stored there. It is a remarkably pervasive 
human activity: we organize the contents of our homes 
and workplaces so that ‘everything has its place, a 
place for everything’, in order that we can find things 
again. We obtain the necessities of life by browsing 
enormous, ever-changing ranges of goods in a 
multiplicity of locations (by shopping). 


In a library context Apted¹ defines ‘general purposive
browsing’ as:-


‘planned or unplanned examination of sources, 
journals, books or other media, in the hope of 
discovering unspecified new, but useful, 
information’. 

Although searching is going on, there is no 
requirement for query formulation. A browser may 
have in mind a word or phrase for the subject of the 
information they seek, or they may not. Items in libraries 
have prominently displayed titles, and, more generally, 
shelves usually have signs summarizing the subject 
range of their contents. The browser is continually 
scanning these visual clues, until one triggers the 
realization that a closer inspection might be worthwhile.
An online catalogue can list items having the same 
author, the same index term etc. They can be browsed, 
but looking at a scrolling list of, say, titles on a screen 
is not as comfortable nor as quick as glancing along a 
shelf. 

When the ordinary person searches for information 
they browse, first their home or workplace and later 
possibly a library or bookshop. The majority of users 
of public libraries very rarely use the online catalogue 
but go straight to the shelves. The library staff are 
typically the chief online catalogue users. Browsing is 
a perfectly usable strategy on its own: I went through 
university with a friend who never used the catalogue 
and successfully got a degree. 

A concomitant problem of the query approach to 
information is the complexity this entails. Querying 
puts an enormous burden on the user, who has to 
conceptualize search terms, combine them with 
Boolean operators and then scroll through an unordered 
set of items, which may be too large or too small. These
drawbacks of Boolean searching for inexperienced 
users have been widely discussed^^. 

Computers are notorious both for the cryptic nature 
of the interactions they conduct and the punctiliously 
correct syntax they demand in dialogues. Querying is 
essentially unnatural: it forces a person to communicate
on the computer's terms. It has to be learnt, and many 
people never acquire the skill. Borgman’ found that 
one quarter of a group of inexperienced searchers
were unable to pass a bench-mark test of minimum 
searching skill. 

Browsing, however, is simple and depends solely 
on a person's innate ability to recognize some intrinsic
ordering. It is this ordering which makes browsing a 
viable strategy*?. Because similar items are stored 
together browsing involves looking for one relevant 
item, on the assumption that its neighbours might also 
be relevant. While shelf orders can be understood by
common sense reasoning, working on the example of
ordered items in sight, no such strategy exists for
online catalogues. The onus is on the user to build up 
a conceptual model in their mind of how the online 
catalogue works. 

Compounding the complexity of individual online 
catalogues is the babel of different types. Although the
principles are similar, menu structures, command 
syntaxes etc are different. Being able to use one such 
system is no absolute guarantee of being able to use 
another. A shelf is a shelf: they all work the same way. 

On a shelf one can store literally anything (physical
size permitting), eg books, videos, photographs,
computer discs, CDs, records, etc. Many different
types of item can be stored in a homogenous way. The
items retrieved via an online catalogue are homogenous
only in the sense that they take the form of text (from
the bibliographic descriptions) on a screen. These items
have no physical locations that can be used to find 
them again later. They have no physical appearance to 
stick in the mind. For these crucial aids to memory the 
online catalogue user must go to the shelves and 
browse, albeit with the knowledge of a class mark to 
speed up the process. 

Adding to the problems of complexity and diversity 
of online catalogues is the lack of continuity a user will 
experience in their use. A user queries an online 
catalogue. After a period they return and have to redo 
the query. This is an enormous chore, as it assumes 
that the user will be able to remember how to perform 
the query. They also must recognize from any 
references retrieved those not previously seen. 

Once the location of relevant items has been found 
on a shelf, a user need only remember that location
and return there to redo the search. New items that
might be relevant ought to be shelved at that location, 
so they are automatically found by a later search. Items 





encouraged to do this sort of thing to library shelves. 

A shelf does surprisingly well in the above 
comparison. The reason is that certain fundamental 
disadvantages have been omitted:— 


1. Not all the items in a library’s collection are on 
the shelves due to being out on loan, being 
consulted by a user, sent for binding, missing 
due to theft etc. A record for an item is always 
present in an online catalogue. 

2. An item typically appears in only one shelf 
location. Even multiple copies are shelved 
together. For access points like author, title or 
subject the user has to make recourse to the online 
catalogue. 

3. Linear order is always constrained by limitations 
on the library space available (eg shelves cannot 
be too high as they then could not be used without 
ladders) and by the size and shape of items to be 
stored, so that separate linear orders are necessary 
for oversize books or records. By reducing items
to text descriptions an online catalogue can 
provide complete linear orderings if index entries 
are listed. 

4. Shelves have to compete for space with other 
library services and functions, which is at a 
premium in most libraries. Users do not have 
access to all the collection as closed access stacks 
for extra storage are usually necessary. Online 
catalogues do not take up much space. 


These disadvantages comprehensively rule out shelf 
access alone in libraries. However there is an obvious 
symbiosis between the function of the shelves and that 
of an online catalogue in a library. The expected norm 
is of open access to the shelves coupled with access to 
a catalogue. The catalogue, consisting of textual records
for items, can be easily computerized using a 
specialized database or text retrieval system. Until 
recently there has been no technology available for 
computerizing the information space described by a 
library’s shelving. 

A computerized information space would be a 
display in three dimensions whose structure and 
appearance would be determined by an information 
resource, in exactly the same way as the shelving 
arrangement in a library is related to the collection it 
is housing. How would such a technolo 


main consequence of replacing physical space with a 
computer-generated version is that all of the 
disadvantages of shelves in physical space disappear:- 


1. All of the items in a collection are permanently 
on display, as they are images drawn from data 
in bibliographic records. The physical 
description present in bibliographic records 
could be used to give such item images a 
consistent and realistic appearance (although 
not one that matches the actual item). 

2. Because each item is an image, it can be drawn 
as many times as required. Hence the same item 
could appear in any number of different shelf 
orderings and at many different positions in each 
of those shelf orderings, as appropriate. Shelf 
orderings for author, title and subject would 
remove the need for a separate catalogue. 

3. Display of items in a linear order on shelves is 
not in any way limited. Oversize or non-book 
items can be interfiled with ordinary books, as 
only images are involved. Shelves can be in any 
pattern or organization, from the traditional bay 
arrangement through to the wildly surreal and 
bizarre. The principle of design would simply 
be to aid browsing, with no other constraints*. 
Limitations of space disappear, as the 
information space could be devoted solely to 
support browsing. 


A system able to display information space would 
close the division between shelf access and catalogue. 
While not permitting access by query, all the access 
points present in a catalogue would be available for 
browsing. Existing bibliographic records could be 
used to build an information space, which would closely 
resemble traditional library open access shelving, and 
thus would provide an ideal browsing environment, 
one truly in line with user expectations for searching 
for information! 


Virtual reality 
A technology which could construct an information 
space resembling library shelving has recently 
appeared. Virtual reality is a term used to describe 
worlds which can be created by computer. Humans 
can be totally immersed in these computer generated 
worlds by the real time generation of stereo images 
which are then viewed through head mounted 
displays”. Interaction with the world can be achieved 
with three-dimensional input devices like ‘data
gloves’*. These translate hand gestures into control
functions, and give the user a presence in a virtual 
world by means of a floating hand. Nugent” provides 
a useful summary of the origins and development of 
virtual reality. Current application areas for immersive
virtual reality involve telepresence, military simulators, 
architectural design and leisure!" 

These applications of immersive virtual reality 
involve the use of extremely expensive, custom-designed
systems. They have to be, as the hardware power and
software complexity needed to generate, in real time,
anything like realistic worlds stretch current
computer technology to its limits. Systems with this
capacity are likely to be prohibitively expensive for the 
next few years. The major players in the immersive 
virtual reality arena are large research laboratories in 
America and Japan. These are supplied by a range of 
small start-up companies supplying either peripherals 
(eg the Data Glove from VPL) or specialised software 
(eg the WorldToolKit from Sense8). 

Currently in the UK two suppliers of immersive
virtual reality systems exist. Division, based near
Bristol, uses a system unit containing a number of
transputers running in parallel, controlled by a novel 
real-time operating system, to provide the power for 
virtual world building. Rendering (ie filling in the world 
outline with realistic surfaces and shading) is done by 
special graphics processors from Toshiba. The system, 
known as Provision, can accommodate a range of 
headsets and input devices. Division have sold systems 
to large academic research establishments and to 
commercial concerns, although of the latter they prefer
not to say which. Division has recently merged with two
US companies based in California, to give it a foothold 
in the American market. 

The other UK supplier of immersive virtual reality 
equipment has aimed at a different market. W 
Industries, based in Leicester, supplies virtual reality 
games to arcades, entertainment centres and nightclubs, 
both in this country and abroad. Their system is based 
on a customized Amiga 3000 and a Texas Instruments
graphics processor for rendering, and their own 
proprietary world building software and range of 
headsets and consoles (either sit down units that look 
very much like truncated racing cars or stand up units 
that surround the user in a protective ring). While they 
have sold a handful of systems for research, their systems
are aimed squarely at the leisure industry. 

Because immersive virtual reality is so expensive 
and the virtual worlds it can build are still somewhat 
primitive and difficult to interact with, an alternative 
approach to virtual reality has appeared, desktop virtual 
reality. These systems run on desktop computers
(albeit powerful ones) and consist of a large, high- 
resolution, colour monitor to present realistic three- 
dimensional images and devices like the ‘spaceball’ (a 
mounted ball which can be rotated to set direction and 
speed, and which possesses buttons for control actions)
or the ‘3D mouse’ for navigation and control. Desktop 
virtual reality systems can be upgraded to immersive 
systems if need be by using a headset and more 
sophisticated interaction devices, like the data glove. 

The only UK based supplier of desktop virtual 
reality systems is Dimension International. They supply
a programming tool kit for building virtual worlds,
Superscape, which is based on an earlier product,
Freescape, for designing games with three-dimensional
graphics. Superscape runs only on high-end PC
compatibles. 

At Virtual Reality '92, back in April, I made contact
with Virtual Presence, a company which distributes 
the Sense8 WorldToolKit, mentioned previously, a 
very reasonably priced development system which 
can run on 486-based PCs, with an Intel DVI board for 
rendering. If more machine power is needed the 
WorldToolKit also runs on workstations from Sun and
Iris. The WorldToolKit consists of routines to draw 
three-dimensional graphics and to manipulate them. 
Virtual worlds can be constructed from libraries of 
these graphics. Drivers are included both for complex
interaction devices like the data glove and for
basic ones like the spaceball.

To build an information space for library browsing 
there is no need for complex interaction devices like 
the data glove, which typically provide for complex 
control gestures and tactile feedback. The resolution 
of currently available headsets is too low to display 
text adequately and the immersive aspect of headsets 
does not offer much without the incorporation of text. 
A desktop virtual reality system, like the one offered 
by Virtual Presence, would be an ideal development 
platform. 


The virtual reality library 

So far, I have attempted to make a case for browsing 
shelves as a powerful metaphor for a new form of
computerized access to library collections, and then to
present an analysis of currently available virtual reality
technology, in order to show that a practical, affordable
and expandable development system exists for such a 
development. I would now like to introduce a term of 
my own, the virtual reality library, to describe such a 
system. The following explanation of the workings of 
this novel system assumes that the bibliographic data 
from a typical library collection is being accessed via 
an interface which looks, on screen, like a roomful of 
shelves, and which a user would navigate and control 
using a device like a 3D mouse. 

The virtual reality library would be composed of 
images of shelves and items, and background walls, 
floors and ceilings, all rectangular shapes. Since virtual 
worlds are computer generated from large numbers of 
polygons (currently available systems draw up to 
35,000 polygons per second) a virtual reality library 
presents fewer problems than existing applications in, 
say, flight simulation, where natural shapes like trees 
are difficult to draw. Requirements for shading effects 
of light in the virtual reality library would be minimal, 
while there would be no necessity for detailed animation
of individual objects. 

Items themselves could be displayed by using their 
physical description field for their basic dimensions 
and appearance and a hashing algorithm to generate a 
spine colour from their publisher name. The hashing 
algorithm would drop common words from a 
publisher’s name and transform the remaining text into
a colour. For extra item differentiation a series name
(if any) could also be hashed by the same algorithm to
generate another colour for a separate band on the
spine. These operations would give items a consistent
and recognizable appearance. Oversize items could be
trimmed to fit the shelf while non-book items could be 
indicated by an appropriate image in place of a book. 
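
The hashing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop list, the use of SHA-256 and the choice of the first three digest bytes as red, green and blue values are all assumptions of the sketch, as the description above fixes none of them.

```python
import hashlib

# Hypothetical stop list: the text says only that common words are dropped.
COMMON_WORDS = {"and", "the", "of", "co", "ltd", "inc", "press", "publishers"}


def spine_colour(name: str) -> tuple[int, int, int]:
    """Map a publisher or series name to a repeatable (red, green, blue) triple."""
    words = [w.strip(".,&") for w in name.lower().split()]
    significant = [w for w in words if w and w not in COMMON_WORDS]
    # Fall back to the full name if every word was dropped.
    key = " ".join(significant) or name.lower()
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # The first three bytes of the digest serve as the colour components.
    return digest[0], digest[1], digest[2]
```

Calling the same function on a series name would give the colour for the second band. Because common words are dropped before hashing, variant forms of a publisher's name that differ only in such words receive the same colour.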

To browse the virtual reality library the user would 
use the 3D mouse to move around. Initially the user 
would find themselves in a large room in which items 
on shelves were in classified order, as in a typical
library. Unlike a physical library, items in the virtual
reality library could appear in this classified order in
as many places as their bibliographic record contained
classification codes. This room would be the main 
room of the virtual reality library. 

Doors off this main room would lead to author, title 
and subject shelf orders of items. Multiple images of 
the same item would be present in the author order (if 
they had more than one author) and in the subject order 
(one image per subject term or heading or PRECIS 
entry). Optionally another door from the main room 
would lead to a shelf order based on another 
classification scheme, if all items had the requisite 
classification codes. 
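
A minimal sketch of how the rooms might be filled from bibliographic records follows. The record layout and the names `Record` and `build_rooms` are hypothetical, chosen only for illustration; a real system would work from full catalogue records.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A cut-down bibliographic record; the field names are illustrative."""
    title: str
    authors: list[str]
    subjects: list[str]
    class_codes: list[str]


def build_rooms(records: list[Record]) -> dict[str, list[tuple[str, str]]]:
    """Return one sorted shelf order per room as (shelf key, title) pairs."""
    rooms: dict[str, list[tuple[str, str]]] = {
        "classified": [], "author": [], "title": [], "subject": []
    }
    for rec in records:
        rooms["title"].append((rec.title.lower(), rec.title))
        # One image per author, subject term and classification code.
        rooms["author"] += [(a.lower(), rec.title) for a in rec.authors]
        rooms["subject"] += [(s.lower(), rec.title) for s in rec.subjects]
        rooms["classified"] += [(c, rec.title) for c in rec.class_codes]
    return {room: sorted(images) for room, images in rooms.items()}
```

An item with two authors therefore yields two images in the author room, each filed under a different name.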

Unlike a physical library, items would always be on 
the shelves and always be correctly located on those 
shelves. Items on a shelf outside
close viewing range would be displayed as a featureless
set of book spines. When the user moved into viewing
proximity of the items each would be drawn using the
procedure described previously, with the addition of
text on the spines for author and title and, if necessary,
the data which determined the shelf position (either
classification code or subject). The text on the spine
could be in the language of the item, to identify non- 
English language items. Optionally the date of 
publication could be added, to enable age judgements. 
The user could browse these items and choose to view 
the bibliographic details of any one of them by clicking 
once on its image. 
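
The distance rule for spine text might look like the following sketch. The dictionary keys, the separator and the `close_range` threshold (in arbitrary units) are assumptions made for the example.

```python
def spine_text(record: dict, shelf_key: str, distance: float,
               close_range: float = 2.0) -> str:
    """Return the text drawn on a spine, or '' for a featureless distant spine."""
    if distance > close_range:
        return ""
    parts = [record["author"], record["title"]]
    # Add the datum that fixed the shelf position, if this order uses one.
    if shelf_key in ("class_code", "subject"):
        parts.append(record[shelf_key])
    # The publication date is optional.
    if record.get("year"):
        parts.append(str(record["year"]))
    return " / ".join(parts)
```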

The item would remove itself from the shelf and 
open to reveal its bibliographic details. These could be
displayed in a pseudo title-page format, that is headed
by a centred title, with authorship details immediately
underneath, publication information at the bottom, 
etc. If the user moved away then the item would be 
returned to the shelf. Also single clicking on the item 
would put it back on the shelf. 

Double clicking on the item would also put it
back on the shelf, but would allow the user to indicate 
that the previously viewed item was relevant in some 
way. This double click action would have two 
consequences. Firstly, the item's details would be 
added to a list which would be printed out for the user 
at the end of that user's session in the virtual reality 
library. Secondly, the item would be highlighted in all 
its other shelf positions (at least one per room). The 
rationale here is that if an item is relevant then its 
neighbouring items are also likely to be relevant. Thus 
finding a relevant, known item in the author sequence 
would lead to the position of that item in the classified
and subject sequences being highlighted. The user 
could then move to these highlighted sections for 
more, hopefully productive, browsing. Highlighting
could be done by making item images glow or
blink, or by whatever method was found most
appealing by users.
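
The two consequences of a double click can be sketched as follows. The class name `Session` and the representation of each room as an ordered list of item identifiers are hypothetical simplifications.

```python
class Session:
    """Tracks what one user has marked as relevant during a visit."""

    def __init__(self, rooms: dict[str, list[str]]):
        # rooms maps a room name to the item identifiers on its shelves, in order.
        self.rooms = rooms
        self.selected: list[str] = []
        self.highlighted: dict[str, list[int]] = {}

    def double_click(self, item_id: str) -> None:
        """Record the item for the printout and highlight every copy of it."""
        if item_id not in self.selected:
            self.selected.append(item_id)
        for room, shelf in self.rooms.items():
            positions = [i for i, shelved in enumerate(shelf) if shelved == item_id]
            marked = self.highlighted.setdefault(room, [])
            marked.extend(p for p in positions if p not in marked)

    def printout(self) -> str:
        """The list handed to the user at the end of the session."""
        return "\n".join(self.selected)
```

Marking the same item twice leaves both the printout and the highlighting unchanged.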

If a thesaurus structure was available then shelves 
containing items with subject terms linked by the 
thesaurus to any subject term in the chosen item could 
also be highlighted. In this way browsing could be 
directed to shelves containing items on broader and 
narrower subjects. 
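
As a sketch of the thesaurus step, assume (purely for illustration) that the thesaurus is held as a dictionary of broader and narrower terms and that each shelf is labelled with a single subject term.

```python
def shelves_to_highlight(item_subjects: list[str],
                         thesaurus: dict[str, dict[str, list[str]]],
                         shelf_subject: dict[str, str]) -> list[str]:
    """Return the shelves holding broader or narrower subjects of the chosen item."""
    related: set[str] = set()
    for term in item_subjects:
        links = thesaurus.get(term, {})
        related.update(links.get("broader", []))
        related.update(links.get("narrower", []))
    # The item's own subject shelves are handled by the double click itself.
    related -= set(item_subjects)
    return sorted(shelf for shelf, subject in shelf_subject.items()
                  if subject in related)
```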

Browsing inside a virtual reality library could further 
be aided by constant guiding. It could offer audio help, 
which would be activated by the user’s progress 
through the virtual reality library. Digitized speech 
would tell the user how to view items, what rooms 
contain etc. There would be limits on the number of 
times a piece of audio help could be triggered, to avoid 
annoying users. Digitized speech would be produced 
via a sound board in the computer. 
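
The limit on repeated audio help could be enforced with a simple counter, as in this sketch. The event names and the default of two plays per message are assumptions; actual speech output is left out.

```python
from collections import Counter


class AudioHelp:
    """Hands out each help message at most max_plays times."""

    def __init__(self, messages: dict[str, str], max_plays: int = 2):
        self.messages = messages
        self.max_plays = max_plays
        self.plays: Counter = Counter()

    def trigger(self, event: str):
        """Return the message to speak, or None once the limit is reached."""
        if event not in self.messages or self.plays[event] >= self.max_plays:
            return None
        self.plays[event] += 1
        return self.messages[event]
```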

Of vital importance in constant guiding would be 
visual signs, mainly textual, but they could be pictorial 
or iconic if needed. Visual signs would resemble those 
in a physical library, in that each bay of shelves would 
have a sign indicating its contents. Each shelf could 
also have a sign indicating its contents. These signs 
would only be visible if the user could see the shelf but 
was not near enough to actually see its items in detail 
— that is the identifying text on their spines. Each door 
would have a sign indicating the contents of the room 
it led to.

The functioning of a hypothetical virtual reality 
library has been described in detail, making clear the 
way it matches the browsing activity in physical 
libraries, thereby allowing users to transfer expectations 
and actions from their experience of browsing physical 
libraries. The simplicity of the virtual reality library 
can be demonstrated by the following user guide, 
which assumes entrance and exit doors in the main 
room:- 


1. To use the system, enter through the door marked
‘Entrance’. You are now in the main room of the 
library. 

2. Browse around the rooms in the library, 
following signs and verbal prompts to shelves 
that may contain items of interest. 

3. Move close to a shelf to see the items it contains.
Single-click on an item in a shelf to see all 
details for that item. 

4. Single-click on an item’s details if it is not 
interesting to put it back on the shelf. 

5. Double-click on an item’s details if it is 
interesting. It will be put back on the shelf and 
other shelves in all rooms containing copies of 
that item will be highlighted. Now browse these 
shelves. 
6. To leave the system, exit through the door
marked ‘Exit’ in the main room of the library. If
you have double clicked on any items, their
details will be printed out for you.


The virtual reality library possesses all the advantages 
that traditional shelf access had over typical online 
catalogues, and none of the disadvantages. In fact it 
includes many of the functions of the online catalogue, 
but translated into shelf browsing, like finding known 
items in the classified sequence. No doubt experienced 
online searchers will bemoan the loss of query facilities, 
but will ordinary users, bearing in mind its simplicity? 


Applications of the virtual reality library 

At first sight the obvious application is as a replacement
for an online catalogue in a library. However, in a 
typical library there will already be shelves to browse. 
Having to duplicate this arrangement precisely in the 
virtual reality library (so as not to confuse users) 
would require an enormous amount of effort, which 
would have to be re-expended for each library covered.
An even worse drawback is that items in the virtual 
reality library cannot look as they would do in the 
physical library, as there is not the detail required in 
their bibliographic descriptions. The only type of library 
which would benefit from an on-site virtual reality 
library would be a closed access one. 

Considering the growing popularity of CD-ROM as 
an information resource, and the impossibility of shelf 
access to such a resource currently, then this becomes 
a useful trial application area. A virtual reality library 
for a CD-ROM could be compared to its standard 
retrieval interface for a meaningful evaluation. In
collaboration with a colleague, Anne Morris, I currently
have a request for funding for such a project under 
consideration by the British Library Research and 
Development Department. 

What about library collections accessed via a 
network, the virtual library? A virtual reality library 
would again accommodate browsing for shelf access 
amongst the physical collections in a virtual library (I
hope that the distinction between virtual library and
virtual reality library is clear; perhaps this application
should be termed a virtual reality virtual library!). The
problem here is that bibliographic data cannot be
stored within the virtual reality library, but must be
retrieved via a network. Response time over the network
will be a major problem, unlike the case of CD-ROM,
which serves as local storage. Applications here will
depend on network advances. There is another problem
with the virtual library. It comprises a number of library
collections. So far the assumption has been that a single 
collection only has been behind the virtual reality 
library. Since each library collection in the virtual 
library is housed in a physical library somewhere then 
perhaps a virtual reality library could be used to build 
replicas of all these physical libraries? Appealing
initially, this route has two difficulties. The first is the
immense task involved in mapping the floor plans of
all these physical libraries and turning the floor plans
into virtual reality libraries. This operation will only 
create one shelf order (the classified one) for each 
physical library and other shelf orders will be needed. 
The second is that a user unfamiliar with a distant 
library's floor plan would gain nothing from a virtual 
reality library which strove to emulate it. 

An initial approach would be to build a single virtual 
reality library, which would, according to the user's 
choice of library over the network they wished to 
access, be populated with items from that chosen 
library. These items would be obtained by querying 
the online catalogue of a physical library via a network
connection. This would homogenize to a certain extent 
all the collections available via the virtual library from 
different physical libraries: they would all be housed 
in the same virtual reality library. 

It should be recalled that, unlike a physical library,
the appearance of a virtual reality library is completely 
under the control of the system designer. One way in 
which this control might possibly be used to enhance 
the virtual reality library for this application, without 
making its appearance or operations off-putting to a 
user, would be to create a standard set of shelf orderings,
in which the designer determined the range of contents 
of each shelf. Thus a certain shelf, in a certain bay, in 
the classified order of the virtual reality library, might 
be set to contain items on, for example, the French 
Revolution. If a user got to know where this shelf was, 
then for each library over the network they accessed, 
items on the French Revolution would always be on 
this shelf. In this way the complexity of accessing 
different library collections would be completely 
hidden from the user. The contents of a shelf, when
requested for browsing by a user, would be obtained 
by querying, via the network, the online catalogue of 
the physical library concerned. 
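
The fixed shelf ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shelf plan, the `Catalogue` class and its `search_subject` method are hypothetical stand-ins, and the in-memory records replace what would really be a query sent over the network to a library's online catalogue.

```python
from dataclasses import dataclass

# Hypothetical shelf plan fixed by the designer: (bay, shelf) -> subject.
SHELF_PLAN = {
    (1, 1): "French Revolution",
    (1, 2): "Napoleonic Wars",
}


@dataclass
class Catalogue:
    """Stand-in for a remote online catalogue (assumption: held in memory here)."""
    name: str
    records: list  # list of (title, subject) tuples

    def search_subject(self, subject):
        # A real system would issue this query over the network.
        return sorted(title for title, subj in self.records if subj == subject)


def shelf_contents(catalogue, bay, shelf):
    """Return the items shown on a fixed shelf for the chosen library."""
    subject = SHELF_PLAN[(bay, shelf)]
    return catalogue.search_subject(subject)


library_a = Catalogue("Library A", [
    ("Citizens", "French Revolution"),
    ("Waterloo", "Napoleonic Wars"),
])
library_b = Catalogue("Library B", [
    ("The Terror", "French Revolution"),
    ("The Old Regime", "French Revolution"),
])

# The same shelf holds the same subject whichever library is selected.
print(shelf_contents(library_a, 1, 1))
print(shelf_contents(library_b, 1, 1))
```

The user only ever learns one shelf location per subject; the choice of library changes what is fetched, not where it is displayed.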

One objection is that each library will have a different
number of items on, say, the French Revolution, and
thus the shelf size needed to accommodate them would
change. For a virtual reality library, a shelf can be
sized as required. If a shelf needed extending on the fly
to accommodate a lot of items then this, and
corresponding size changes on other shelves, shrinking
or expanding them as required, could be made. The idea
of a single virtual reality library with fixed shelf 
contents, accommodating a succession of different 
collections, and homogenizing them for user access, is 
an intriguing one. 
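
One possible way of resizing shelves on the fly is to share a fixed run of shelving among the subjects in proportion to the number of items each library holds. The sketch below assumes this proportional rule, a hypothetical `shelf_widths` function and arbitrary width units; none of these is prescribed by the approach itself.

```python
def shelf_widths(item_counts, total_width, min_width=1.0):
    """Share total_width among shelves in proportion to their item counts.

    Every shelf keeps at least min_width so that an empty subject
    still has a visible, findable place (an assumption of this sketch).
    """
    spare = total_width - min_width * len(item_counts)
    if spare < 0:
        raise ValueError("total_width is too small for the number of shelves")
    total_items = sum(item_counts.values())
    widths = {}
    for subject, count in item_counts.items():
        share = count / total_items if total_items else 1 / len(item_counts)
        widths[subject] = min_width + spare * share
    return widths


# Two libraries with different holdings fill the same 10 units of shelving.
first = shelf_widths({"French Revolution": 30, "Napoleonic Wars": 10}, 10.0)
second = shelf_widths({"French Revolution": 10, "Napoleonic Wars": 30}, 10.0)
print(first)
print(second)
```

The shelves keep their order and their subjects; only their extents change as the user moves from one collection to another.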

Such manipulations do not fall outside user 
expectations. Others are possible which do. As a user 
of libraries I am aware that there are sections of the 
library which I habitually use and others in which I 
have no interest. In a virtual reality library a user could
be given control over the layout of the library. Shelves 
containing uninteresting material could be deleted by 
the user. In addition shelves containing relevant items 
could be moved to a preferred location, for example a 
personal room (or a virtual carrel?) in the virtual reality
library. An extra ‘exit’ door could allow a user to save
their modified library layout, under a mnemonic name
which could serve as a filename for their configuration.
An extra ‘entrance’ door would permit a user to enter
this mnemonic name to reactivate their earlier
configuration.
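
The ‘exit’ and ‘entrance’ doors amount to saving and reloading a configuration under a name. A minimal sketch follows, assuming the layout can be expressed as a small dictionary and stored as a JSON file named after the mnemonic; the function names, the file format and the layout fields are all illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path


def save_layout(layout, name, directory):
    """'Exit door': store the user's layout under a mnemonic name."""
    path = Path(directory) / f"{name}.json"
    path.write_text(json.dumps(layout, indent=2))
    return path


def load_layout(name, directory):
    """'Entrance door': reactivate a layout saved earlier under that name."""
    return json.loads((Path(directory) / f"{name}.json").read_text())


# Hypothetical layout: shelves the user deleted and shelves moved elsewhere.
layout = {
    "deleted_shelves": ["Napoleonic Wars"],
    "moved_shelves": {"French Revolution": "personal room"},
}

with tempfile.TemporaryDirectory() as library_dir:
    saved_path = save_layout(layout, "history-corner", library_dir)
    restored = load_layout("history-corner", library_dir)

print(restored == layout)
```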

In the introduction I mentioned that physical shelves 
can have their appearance altered. So can the shelves, 
and indeed all the elements, in a virtual reality library. 
Do you prefer carpets to be a soft green and walls and
ceiling a restful purple? No problem! The entire
ambience of the virtual reality library could be open to
user choice. The virtual reality library could look like
a Babylonian library, with stone tablets on its shelves,
or like the British Museum Reading Room, all plush
baize and wrought iron. Finally users could be allowed
to add small embellishments of their own: a fountain
here, a marble statue of Venus there. These
embellishments could either be selectable from within
the virtual reality library, or created by importing
three-dimensional image files (eg .DXF files from
AutoCAD). There is a serious purpose to this: it gives
the user the chance to personalize their browsing
environment. Adding statues at strategic locations
would be an excellent aide-memoire to the location of
useful sections for browsing.

The arrangement of shelves in physical libraries in
serried ranks is designed to economize on space rather
than serve accessibility. Another drawback of browsing
in a physical library is the crick in the neck one gets
from reading vertical spines. In a virtual reality library
the designer is free to arrange shelves, and books on
shelves, in whatever layout they feel best serves
browsing. I propose a virtual reality library shaped
like a hub with spokes. The hub is the central room,
from which the inner surface of each hollow spoke can
be seen. The inner surface of each spoke would contain
a shelf ordering, following a spiral path out from the
hub. Individual shelves would be vertical, so that the
items on them would be in horizontal stacks, to aid
spine reading.

Of course, the end result of all this is a totally 
fantastical library. This has precursors. Jorge Luis
Borges, director of the Argentine National Library
and noted fabulist, describes an infinite library, made
of interconnected hexagonal galleries, in the story The
Library of Babel. This story presents life as a fruitless
search for knowledge in an ironically inimical
environment, a library. In the Library of Babel people
spend their lives wandering its endless shelves, on
which randomly arranged books contain randomly
patterned letters, which may or may not make sense as
words. Such a library could be constructed using virtual
reality, but perhaps only as a dire warning.


Conclusion 

It is very easy to get carried away with visions of what 
virtual reality technology might do. Spring has written
about virtual reality libraries with visible hypertext-like
links between related items and colours to indicate
levels of relevance, while Seiler and Surprenant have
imagined the ‘virtual reality information centre’, where
all the information one could possibly want is delivered
by an ‘infinite library’ in one’s home. The strangest
vision is that of Benedikt, who has envisaged a slide
library that looks rather like a 3-dimensional graph: 
choosing a point in this graph opens up a window onto 
the items located there. However, these visions are 
well ahead of the available technology and need 
substantial expansions in bibliographic records, and 
therefore are unrealizable in the near future. The 
approach in this paper is to investigate what is feasible 
using existing technology and bibliographic records. 
There is a basic need for browsing which can be met 
for certain types of library collection, either lacking a 
physical repository or possessing one which for some 
reason is inaccessible, by using virtual reality. Virtual
reality technology can create such a repository, one more
expansive and powerful for browsing than a physical
repository would be. Libraries, since their earliest
days, have been designed to allow access by browsing 
their shelves, and so a virtual reality library is merely 
a new application of an old idea. 
Abstract


In a hypertext database, the information is presented as a network of nodes connected by links. Such nodes may be text, graphics, audio, video, and even other software.

Although hypertext provides a new approach to information management, it also leaves a whole new set of problems for the designers of the hypertext database to solve. As the volume of information grows, the task of authoring a hypertext database becomes much more complex.

In this article, the author presents the experience gained during the development of a hypertext version of the user’s guide for information services on JANET, in the UK, by using HyperPAD, a hypertext shell for the IBM PC. It may be the first step towards exploring the proper way to solve those problems which come together with the increasing application of hypertext.


Introduction 

Hypertext is usually defined as the nonlinear 
organization and viewing of information on a 
computer. Nonlinear means that you can read the
stored information in any order you wish by selecting 
the topic indicator on the screen. 

Hypertext as an idea has existed for more than 40 
years. In 1945, Vannevar Bush, a science advisor to 
US President Roosevelt, described this idea in an 
article entitled As We May Think. He designed a 
machine called ‘Memex’, which contained microfilm
copies of all scientific information.

But the term ‘hypertext’ had not been used widely 
until Ted Nelson coined and popularized it in his many
lectures and articles. 


The first hypertext system implemented was NLS
(now called Augment) during the 1960s. Since then,
many hypertext systems have been created as experimental
systems. In 1987, the first major conference devoted
to the topic of hypertext, Hypertext 87, was held at the
University of North Carolina in the USA. This
conference brought together hundreds of scholars
interested in hypertext research and became a
milestone in this field. Intensive research and
development activities are leading to many
commercially available systems such as HyperCard
for the Macintosh and Guide for the IBM PC. HyperPAD
is a relatively new hypertext shell developed by
Brightbill-Roberts Ltd; version 2.0 went on sale in 1990.

Hypertext has many potential applications that are 
just beginning to be explored. It can be used to create 
encyclopaedias, technical documents, teaching 
materials, and product catalogues. In some specialized 
areas such as creative writing, computer-aided software 
engineering, online help systems etc, hypertext is 
being examined as a powerful tool.


Although the term ‘hypertext’ has been with us for 
more than two decades, there are still many issues 
which are not understood very well and are currently 
being researched. For system design, the issues
attracting intensive attention include annotation,
display capabilities, versioning, integration,
performance, printing, networking, and usability. For
implementation they are chunking, navigation,
security, etc.

In order to get hands-on experience of the application
of hypertext systems and have an in-depth view of the 
above-mentioned issues, the author launched a project, 
the aim of which was to create a hypertext version of 
the user’s guide for the information services on JANET 
by using HyperPAD, which was kindly provided by 
Professor J Meadows when the author worked as a 
visiting fellow in the Department of Library and 
Information Studies at Loughborough University of 
Technology in the UK. This user’s guide can also be 
used as self-teaching material for the students of the 
masters course. 

This article is a brief summary of what the author 
discovered during the development of this
hypertext-based user’s guide.


HyperPAD 

Almost two years after HyperCard appeared on the
Macintosh in 1988, IBM PC users began to taste a
similar hypertext shell, HyperPAD, on their machines.
HyperPAD was developed by Brightbill-Roberts Ltd
and version 2.0 came out in 1990.

Though different terms are used for CARD and
STACK (in HyperPAD they are called PAGE and
PAD respectively), HyperPAD may be considered as
an IBM PC version of HyperCard, but it is in colour
and has some additional features. As HyperCard is
well known in the hypertext world, only the main
differences between HyperCard and HyperPAD are
mentioned here as a brief introduction to HyperPAD.


Colours 

Probably envied by HyperCard users, HyperPAD can 
generate dazzling, colourful screens. There are 16 
foreground colours and 16 background colours. The 
user can select any of 256 combinations from a ‘palette’,
and paint the objects (buttons, fields, background,
text, etc) anywhere on the screen.


Visual effects 

More visual effects are available in HyperPAD (26 
options, in contrast to 17 options in HyperCard), and 
they can be controlled precisely in speed. Instead of
simply replacing one screen with another, the user can
select whichever visual effect s/he likes from the
following options to alter the manner in which a
screen is replaced.


EFFECT   OPTIONS  DIRECTIONS

Peel     4        Upper right(left); Lower right(left)
Scroll   4        Left; Right; Up; Down
Wipe     4        Left; Right; Up; Down
Weave    2        Vertical; Horizontal
HSplit   2        In; Out
VSplit   2        In; Out
Box      2        In; Out
Spiral   2        Clockwise; Counterclockwise
Quad     1
Fade     1
Drip     1
Blink    1

(only SCROLL, WIPE, BOX and FADE are available
in HyperCard)

Moreover, the speed of a visual effect can be controlled
to a precision of milliseconds. (In HyperCard there
are only 3 speed options, i.e. Very Slow, Slow, Fast.)
This is very useful when you want to keep the visual 
effect synchronous with the sound effect. 


Run a program 

At any point while running HyperPAD, the user can
launch any program executable under DOS.
The ‘run’ command in the script can execute any other
program stored in any directory on any disc. While
the external program is running, HyperPAD shrinks to
only 3K automatically, giving the program as much
memory as possible. When you exit the external
program, HyperPAD is reloaded at once without
any user intervention and resumes where it left off.
This feature gives users the possibility of
integrating other applications, including other existing
hypertext applications, into a HyperPAD application.


Execute a DOS command 

Some frequently-used DOS operations can be carried 
out within HyperPAD without exiting to DOS: Check 
a disc; Format a disc; Copy a disc; Backup files; 
Restore files; Set date and time, etc. 




Auxiliary utilities 
Some auxiliary utilities are provided with HyperPAD.

a. Cap: with Cap, you can capture images from
nearly any DOS program that runs in character
mode, such as Lotus 1-2-3, dBASE IV,
WordPerfect and Microsoft Word. Then you
can import the captured images into
HyperPAD.

b. Compact: with Compact, you can remove all the
free space in the pad files, which accumulates
while you are modifying or creating these files.

c. Strip: with Strip, you can remove the text portion
of all of the scripts of a pad, so as not to reveal
how the pad was created.


Lower resource requirements

HyperPAD can be run under DOS 2.0 or later, on all 
types of IBM PCs (XT, AT, PS/2, Compaq or 100% 
compatible system) with any colour or monochrome 
display or adapter. 

While running HyperCard needs more than 1 MB of
memory, 448 KB is enough for HyperPAD. A
mouse is not a necessity, as you can use the keyboard
without any restriction on performance.

HyperPAD is perfect for creating interactive tutorials
on any topic, in which graphics are not a significant 
requirement. 

The Department of Library and Information Studies 
at Loughborough University of Technology obtained 
HyperPAD (Version 2.0) in January 1991. It was 
installed on an Olivetti E250 (IBM PC/286 compatible 
with 1 MB memory and VGA display) and runs under 
MS-DOS 3.3. 


Objectives 

It is common sense that technology should be designed
with users’ needs and specific tasks in mind. Thus
when you begin to design a hypertext application
system, you should ask yourself ‘who are the potential
users’ and ‘what sort of tasks should this system
support’.

JANET interconnects the local computer networks 
in UK Research Councils, universities and most 
polytechnics. Nowadays increasing numbers of
information services are becoming available via the
JANET network. Most of the information services
became operational recently, so, to the staff and
students, these services are too new to be used properly.
A comprehensive users’ guide was badly needed. At
the same time, the Masters course in Information
Technology, which includes networked information
services, needed self-teaching material to compensate
for the shortage of face-to-face sessions.

With this context in mind, the author devised the 
following as the objectives of a hypertext system: 


- easy to use. Users without computer
backgrounds can use the hypertext system
without any specific training

- easy to navigate. Users always know where
they are and how to reach their destinations
through various links

- various options. Users can select what they
would like to read and choose the way of
browsing, whether they take the hypertext system
as a users’ guide or as self-teaching material


Design principles 

With the above objectives in mind, the following 
design principles were adopted during the system 
design. 


Chunking 

The information presented in hypertext databases 
needs to be divided into many small ‘chunks’ that deal 
with only one topic. Every chunk works as a node in 
the database. The size of a chunk matters. If
chunks are too large, readers will find it
difficult to locate what they want to read and will get
bored by the longer text. If chunks are too small,
readers will have to navigate too much to arrive at their
destinations and the large number of links will make
authoring much more complicated.

The information for the user’s guide comes from a
report entitled Information services on JANET written
by the author. The original text is arranged in a tree
structure: Chapter > Section > Subsection > Topic. So
during chunking, the original text was divided into
chunks of proper size, based on the topics. Usually the
size of a chunk is about one page (24 lines on the
screen). The maximum size is 50
lines.
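
The size rule can be illustrated with a short Python sketch. It is not the procedure used in the project, where chunking was done by hand; it simply assumes that a topic of up to 50 lines stays in one chunk and that a longer topic is cut into screen-sized pieces of 24 lines. The function name `chunk_topic` is hypothetical.

```python
def chunk_topic(lines, page_lines=24, max_lines=50):
    """Split the lines of one topic into chunks.

    A topic that fits within max_lines stays whole; a longer one is
    cut into pieces of at most page_lines (one screen) each.
    """
    if not lines:
        return []
    if len(lines) <= max_lines:
        return [lines]
    return [lines[i:i + page_lines] for i in range(0, len(lines), page_lines)]


short_topic = [f"line {n}" for n in range(20)]
long_topic = [f"line {n}" for n in range(60)]

print([len(chunk) for chunk in chunk_topic(short_topic)])
print([len(chunk) for chunk in chunk_topic(long_topic)])
```

A mechanical split of this kind ignores meaning, which is why a manual pass is still needed to make each chunk deal with only one topic.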


Links 

Links are the paths that connect one node to another.
Links are usually represented by words or phrases. But
in HyperPAD, most of them are represented by icons.

There are various types of links available for users
in this hypertext-based user’s guide. A reader may be
able to select which link types are active:


Internal links 

These kinds of links are the links created within the 
user’s guide. The nodes at both ends of the link are 
located within the one application of HyperPAD. 


a. Navigation links 

These links help readers move around the user’s
guide. They include: Go to next page; Go to previous 
page; Go back to previous link; Go to Home page; 
Quit, and many other icons. 


b. Cross-reference links 
This kind of link brings reference information (notes,
relevant topics, background information) to readers. 


External links
This kind of link includes:


a. links between different applications of
HyperPAD

b. links between HyperPAD applications and other
hypertext applications


The importance of external links lies in the possibility
of integrating existing hypertext applications, no
matter which hypertext shell is used, into the
current HyperPAD application.

In the author’s project, two other hypertext
applications, developed using Hyperhelper and
Black Magic, are integrated into the HyperPAD
application. They provide hypertext versions of a
beginner’s guide for JANET users.


Screen 

The basic unit of information for HyperPAD is the Page.
One page occupies a full screen. Hence the design of
the screen is very important to the performance of the
Guide. For usability and user-friendliness, at least
the navigation information described below should be
shown on the screen.

Disorientation in a hypertext database is a common
problem for its users, so navigation information is
vital to an application of HyperPAD. This kind of
information should always be at hand for users, i.e.
always presented on the screen.


a. Position information: 

Disorientation stems from the fact that users do 
not have enough information about their current 
location relative to the overall structure of the 
database. As the database is arranged in a tree
structure, i.e. Chapter > Section > Subsection >
Topic, each screen is labelled with
the current Chapter, Section,
Subsection and Topic.


b. Moving-around buttons: 
To help users move from the current screen to their next
destination as they like, a set of buttons is placed
on each screen besides the various menus and icons.
They include the following buttons: Next page;
Previous page; Previous menu; Previous link;
Help information; Home; Quit.
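
The position information and some of the moving-around buttons can be modelled as follows. This is a sketch in Python rather than in HyperPAD's own scripting language, and it makes two assumptions: every move is recorded in a history list, and ‘Previous link’ steps back through that history one move at a time. The class and method names, and the sample page titles, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    chapter: str
    section: str
    subsection: str
    topic: str

    def position(self):
        """Position information shown on every screen."""
        return " > ".join([self.chapter, self.section, self.subsection, self.topic])


class Navigator:
    def __init__(self, pages, home=0):
        self.pages = pages
        self.home = home
        self.current = home
        self.history = []  # pages visited before each move

    def _go(self, index):
        if not 0 <= index < len(self.pages):
            raise IndexError("no such page")
        self.history.append(self.current)
        self.current = index

    def next_page(self):
        if self.current < len(self.pages) - 1:
            self._go(self.current + 1)

    def previous_page(self):
        if self.current > 0:
            self._go(self.current - 1)

    def follow_link(self, index):
        self._go(index)

    def previous_link(self):
        # Step back to wherever the reader came from.
        if self.history:
            self.current = self.history.pop()

    def go_home(self):
        self._go(self.home)

    def position(self):
        return self.pages[self.current].position()


pages = [
    Page("Chapter 1", "Section 1.1", "Subsection 1.1.1", "Topic A"),
    Page("Chapter 1", "Section 1.1", "Subsection 1.1.1", "Topic B"),
    Page("Chapter 2", "Section 2.1", "Subsection 2.1.1", "Topic C"),
]
nav = Navigator(pages)
nav.next_page()        # now on Topic B
nav.follow_link(2)     # cross-reference to Topic C
nav.previous_link()    # back to Topic B
print(nav.position())
```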


Browsing 

Browsing is the way users explore a hypertext database.
It should be easy to use and flexible. Even with little
computer literacy and limited knowledge of
the subject domain, users can have a surprising and
satisfying freedom to ‘travel’ from one node to another
looking for what they are interested in.

To cope with the different needs of users with
various backgrounds, there are three different ways of
browsing this HyperPAD-based database. The user
can select one of them from a menu.

Hierarchical menu-driven (vertical views) 
As we know, this hypertext database is organized in a
hierarchical structure. So the natural way is to browse
it along its hierarchy. A set of menus arranged
in a tree structure will guide users to their destinations.
This way is suitable for beginners and students
who use it as self-learning material.


Contents-list 

The users browsing the database in this way can select 
a specific chapter, section, subsection, or topic from a 
contents list by highlighting the titles. It is just like 
reading a book, the only difference being that the users 
simply press the <return> key instead of turning the 
pages. 

In this way, users can get an overview of the structure 
of the whole database and select only the part they 
want to browse. It is suitable for users who would like 
to review what they have read. 


Facet list (horizontal views) 

Users can access the database from different facets 
instead of its hierarchical logical structure. They can 
select one of the specific features listed on the facet 
lists, which can be accessed through a menu, and jump 
directly to the screen on which the information they 
are interested in is presented. 

This way is suitable for the users who are searching 
for some specific information. 

Each way has its advantages and disadvantages, but 
they complement each other. Selection depends on
the user’s needs, familiarity with the database,
reading capacity, etc.
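
The three ways of browsing are three views of the same set of nodes. The Python sketch below derives each view from one list of topics; the topic titles and facet names are invented for illustration, and the functions are not taken from the HyperPAD application.

```python
# Each node records its place in the hierarchy and the facets it covers.
TOPICS = [
    {"path": ("Chapter 1", "Section 1.1", "Topic A"), "facets": ["access address"]},
    {"path": ("Chapter 1", "Section 1.2", "Topic B"), "facets": ["opening hours"]},
    {"path": ("Chapter 2", "Section 2.1", "Topic C"), "facets": ["access address", "opening hours"]},
]


def menu(topics, prefix=()):
    """Hierarchical view: the menu entries one level below the given prefix."""
    depth = len(prefix)
    entries = []
    for topic in topics:
        path = topic["path"]
        if path[:depth] == tuple(prefix) and len(path) > depth:
            if path[depth] not in entries:
                entries.append(path[depth])
    return entries


def contents_list(topics):
    """Contents-list view: every topic with its full path, in document order."""
    return [" / ".join(topic["path"]) for topic in topics]


def facet_list(topics):
    """Facet view: for each facet, the topics that can be jumped to directly."""
    index = {}
    for topic in topics:
        for facet in topic["facets"]:
            index.setdefault(facet, []).append(topic["path"][-1])
    return index


print(menu(TOPICS))
print(menu(TOPICS, ("Chapter 1",)))
print(contents_list(TOPICS)[0])
print(facet_list(TOPICS)["access address"])
```

Because all three views are computed from the same nodes, adding a topic in one place keeps the menus, the contents list and the facet lists consistent with each other.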


Implementation issues 


Text conversion 

The hypertext version of the user’s guide is based on 
a report with the same title. The first step is to chunk 
the text of the report into many files of proper size.
Because it is difficult to do chunking
automatically, the conversion was done manually
using WordStar. Each chunk can then be ‘imported’ into
the proper field on the screen.


Button manipulation 
Buttons are the symbols of links. There are two kinds
of buttons: permanent buttons and temporary buttons.
Permanent buttons appear on every screen and have a
fixed function; for example, the
Next page button always activates the link to the next page.
A temporary button is a context-related button
which appears on one or a few pages.
Usually the permanent buttons are kept in a button
library. You can recall them from the library whenever
they are needed.


Screen creation 
In HyperPAD, each screen consists of a background
and a page. A background is shared globally within a
pad, i.e. when you modify the background of one
screen, all the screens within the same pad will be
changed accordingly if they use the same
background. Both backgrounds and pages can be cut and
pasted to other screens no matter where they are.
For efficient creation of screens, it is best to
keep the backgrounds and pages already used in a basic
pattern library. Then you can recall, modify and combine
them to form new screens at any time.
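
The sharing of a background between screens can be pictured with a small Python model. It is an analogy only, not HyperPAD's internal representation: the `Background` and `Page` classes and the element names are assumptions made for the illustration.

```python
class Background:
    """Elements shared by every screen that uses this background."""

    def __init__(self, name, elements):
        self.name = name
        self.elements = list(elements)


class Page:
    """Elements that belong to one screen only, laid over a background."""

    def __init__(self, background, fields):
        self.background = background  # a reference, not a copy
        self.fields = list(fields)

    def render(self):
        return self.background.elements + self.fields


guide_background = Background("guide", ["Next page", "Previous page", "Quit"])
first_screen = Page(guide_background, ["text of topic A"])
second_screen = Page(guide_background, ["text of topic B"])

# Modifying the background once changes every screen that uses it.
guide_background.elements.append("Help information")

print(first_screen.render())
print(second_screen.render())
```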


Pad management

In an application, there may be dozens of pads. When 
you create links between different pads, you have to
jump from one pad to the others. Hence it is important
to manage pads correctly. The author created a pad list
for this purpose. On the list, the names of the pads are
buttons linking to them. From this list, you can
select and jump to any pad of the application easily
without the burden of remembering all the pad names.


Conclusion 

Hypertext is a new approach to information
management in which text is presented as a network of
nodes connected by links. Such nodes may be text,
graphics, audio, video, and even other software.

With hypertext technology, it is now possible to
provide large amounts of information in the form of
interactive documents and make it accessible to a
wide population of users, no matter what their
background knowledge is.

Although hypertext provides a new way of 
organizing information, it also leaves a whole new set 
of problems for the designers to solve. As the volume 
of information grows, the task of authoring a hypertext
database becomes much more complex.

The author’s experience presented in this article is
a first step towards exploring the proper way to solve
these problems.
Abstract


This paper introduces the philosophy of Quality Assurance and traces the development of the British Standard for Quality Systems, BS 5750. The key components of the Quality System are covered and there is a discussion on how to choose a Quality System which is most appropriate to the needs of the particular organization. A comprehensive guide (including flowcharts) is also given which addresses the nature and scope of tasks which must be undertaken in implementing a Quality System commensurate with the requirements of a recognized international standard such as BS 5750.


Introduction 

The concept of seeking a guarantee in return for goods 
or items exchanged is not new. In fact, the well-known
phrase ‘my word is my bond’, still in use today, is a 
form of guarantee or assurance that an agreement 
reached or an obligation undertaken will be honoured. 
Guarantees in respect of items purchased (in exchange 
for money) are usually not verbal agreements but take 
the form of signed receipts, which imply that items 
bought will be fit for the purpose for which they were 
advertised or intended and that someone is accountable 
if they fail to live up to those expectations. 

Of course not all items which we purchase come 
with a guarantee — we regularly buy food items for 
example from the local supermarket which come with 
no such guarantees. In this case we trust that the 
appropriate food manufacturer has been inspected by 
the relevant authority to a standard which ensures that 
we, as the consumers, do not suffer any ill effects as a 
result of consuming the items. On the other hand, 
electrical items where there is an element of safety 
involved usually come with a guarantee which not 
only assures the working life of the item but also states 
that it was manufactured to a specific standard. 


The evolution of national standards 

Naturally, there is no merit in claiming that an item 
meets a specific standard unless those standards are 
recognized by all as being adequate measures of 
quality. In response to the need for nationally 
recognized measures of quality and performance, so- 
called National Standards Organizations were 
established and charged with the responsibility of 
developing just such agreed national standards. There 
are in existence today many such national bodies and 
literally hundreds of products for which recognized 
standards have been set. 

The familiar ‘Kite Mark’ of the British Standards 
Institution was in fact introduced as a form of consumer 
protection signifying (to members of the public) that 
items have been manufactured to an appropriate British 
Standard. In essence the Kite Mark is a form of 
product certification. 


However, consumers were not the only driving force
behind the introduction of national standards. As early
as Victorian times it was recognized that some form of
standardization was required for the survival of British
industry and industry ‘norms’ or standards were set
for items such as screw threads, pipe diameters
and so on. Commensurate with the development of
these standards came the concept of ‘inspection’ and
specialist organizations such as the Institute of
Engineering Inspectors (incorporated in 1922 and
now renamed The Institute of Quality Assurance)
were established to fulfil this role and check that
products were being manufactured to agreed norms
and standards.


The rise of Quality Assurance Standards 

As the complexity of industrially manufactured 
products grew, so did the requirements for inspection 
and certification. The situation became particularly 
unwieldy with the passing of two world wars and with 
the increasing sophistication of military and aerospace
products. In the United States, in particular, where a 
large number of complex engineering projects were 
underway (with components being obtained from 
many different suppliers), there were problems of late 
supply, incompatibility and component failure. 

As a result, it was recognized that some form of 
overall management control and co-ordination was 
required and a number of Military Standards (MIL 
Standards) and in the UK subsequently Defence 
Standards (DEF STAN 05-21/1 — ‘Quality Control 
System Requirements for Industry’) were drawn up 
specifically for this purpose. These initial military 
specifications were in reality the forerunners of the 
modern Quality Assurance Standards. They were 
significant in that they enabled an important distinction
to be made between the previous Quality Control/
inspection environment and a new concept for ensuring
overall quality and control: Quality Assurance.

It was not only in the military and aerospace 
industries that there was a recognition that the old 
inspection-based systems were no longer totally 
adequate. In Great Britain, the large nationalized 
industries at that time (e.g. British Gas, British Rail,
The Coal Board, British Steel etc.) were seeking a 
system whereby they could be assured up-front that any 
goods or services that they bought would be delivered 
on time, within budget and to a pre-defined level of 
Quality. As large consumers, these purchasers 
recognized that a system of Quality Assurance could 
provide them with the guarantee they needed (prior to 
committing funds) and they were to a considerable 
extent instrumental in the introduction of Quality 
Assurance to British industry. 


British Standard 5750 

Following the production of a Government White 
Paper on the subject, a British Standard for Quality 
Assurance (BS 5750 ‘Quality Systems’) was published 
in 1979. It described the controls
which a supplier was required to institute in
order to claim that it was a ‘Quality
Assured’ Organization. In the same way as with the
registration of products to a particular standard, an
organization could not be accredited to BS 5750
unless it had been inspected (and formally accredited)
by an independent authority (such as the British
Standards Institution) against the standard.

In contrast to the Kite Mark (which is a method of 
product certification), BS 5750 is a form of company 
certification. The standard specifies all those ‘elements’ 
of the management system which are seen to be 
critical to the quality of the final product and describes 
how these elements are to be controlled. 

Although its initial adoption by industry was quite
slow, a number of organizations have now implemented
Quality Systems commensurate with the requirements
of BS 5750, though its predominance in the
engineering sector remains. In fact, BS 5750 has been
increasingly criticised for its continued focus on the
engineering/manufacturing environments, which a
quick glance at the index of the standard will show. In
more recent years, in particular, a number of
non-engineering and service sector organizations have
recognized that the philosophy of Quality Assurance
is, in fact, applicable to every organization and have
sought a more broadly based guideline or standard.

In an effort to accommodate the views of these other 
industry sectors, a number of QAS (Quality Assurance 
Schedules) have been produced to augment/amplify 
the standard. Schedule no. 8 for example, is written 
specifically for the Service Sector industries and 
contains some additional requirements and guidance 
on the interpretation of the standard's requirements 
for these organizations. In 1987 the entire standard 
was revised and republished and its format was 
significantly amended. The text is now identical with 
that of its equivalent International and European 
Standards — ISO 9000 and EN 29000. 

BS 5750 is due for another major renewal in 1996 
and there is currently some interesting discussion 
underway as to its most desirable format and scope. It 
should be noted that the Nuclear Industry has for some 
time had its own Quality System Standard. In the UK 
this is BS 5882 (Specification for Total Quality 
Assurance Programme for Nuclear Power Plants), 
which is similar in philosophy to BS 5750. 


The philosophy of Quality Assurance 

There are a number of ways in which the concept or 
philosophy of Quality Assurance could be described. 
The term ‘Quality’ is defined in British Standard 4778 
(‘Quality Vocabulary’) as ‘The totality of features and 
characteristics of a product or service which bear upon 
its ability to satisfy a given need’. As we have seen, 
‘Quality’ in today’s terms is also synonymous with 
meeting customer needs and expectations, that is the 
concept of ‘fitness for purpose’. 

The term ‘Quality Assurance’ might then best be 
described as a philosophy or concept whereby all 
those activities and functions which have an impact on 
the quality of the final product are controlled and 
managed, i.e. “All those planned and systematic actions 
necessary to ensure that a product or service will 
satisfy a given need” (BS 4778). To enable the concept 
of Quality Assurance to work in an organization 
therefore requires translating this philosophy into a 
practical system or framework of management control. 

In the first instance, the level of quality that is to be 
achieved must be defined. In cases where products 
(such as manufactured items) have a British Standard 
specifying appropriate levels of quality, this may be a 
fairly straightforward exercise. In the case of a service- 
sector industry such as ‘consultancy’ or ‘health care’ 
it will be more difficult. Secondly, there must be a 
practical and achievable plan for attaining the desired 
level of Quality. This is sometimes referred to as the 
Quality Assurance Programme. Finally, there must be 
a system for ensuring the maintenance of that initial 
level of quality — a Quality Management System. The 
Quality Management System must be demonstrable 
(so that potential customers can see how the company 
plans to maintain the level of Quality) and therefore it 
must be documented and published. 

The term Quality Assurance is often confused with 
the term Quality Control. The two are actually quite 
different. In contrast to a Quality Control System, 
where the quality of the product is measured or 
inspected as the production process progresses, a 
Quality Assurance System seeks to define in advance 
those management controls which must be applied 
and continually adjust these to ensure the adequacy of 
the System and thereby the fitness for purpose of the 
final product. 


Basic principles of Quality Assurance 

There are perhaps three basic principles of Quality 
Assurance that underlie its overall philosophy. These 
are: 


1) Quality is everybody's business (each person 
has a specific quality-related responsibility). 

2) Get it right first time (on the assumption that 
‘prevention is better than cure’ — a properly 
designed Quality System will have identified 
and anticipated all the likely problems in 
advance). 

3) Communicate and plan (the real benefits of 
Quality Assurance come from operating in an 
organized and controlled environment; each 
person must be made aware of their specific 
quality related responsibilities and plan for these 
within their daily activities). 

These principles have been described in the past as 
no more than ‘common sense’. Any organization 
which is successfully providing a product which is fit 
for purpose and a hospitable working environment for 
its employees should be paying some attention to 
these issues. Equally, one would expect some sort of 
formalized controls in place over quality or (at least) 
safety critical activities. However, it is surprising how 
many organizations lack such a formalized structure 
of control. Often, in large organizations, it is not so 
much the fact that no one is paying attention to such 
issues, but more that everyone is doing it in a different 
way and the problem becomes one of overall co- 
ordination and integration. 

In considering the scope of ‘critical activities’ which 
should be addressed by the Quality System, there is a 
lot of room for confusion. An organization may choose 
only to control those activities prescribed by a 
recognized standard (such as BS 5750). However, if 
the organization is in a service sector industry, there 
are few (or no) such guidelines to follow. What then? 
What is the scope and breadth of the critical activities 
upon which the organization should focus? At 
this point it is wise to consider the true purpose or 
motivation for introducing the philosophy of Quality 
Assurance to the organization. 

From a purely market-driven standpoint, one could 
argue that the organization should only seek to control 
those activities which are dictated by the relevant 
Quality System Standard. However, taking this route 
will mean that not all activities within the organization 
will be addressed. For example, ‘marketing’, which 
many would agree is an essential or critical activity, is 
not covered within BS 5750. Therefore a decision may 
be taken to use the standard as the guiding force but 
include a number of other (company-dictated) activities. 
By introducing the requirements of the standard alone 
there will of course be a number of benefits — not least 
of which is registration, giving the company a 
significant marketing edge over its 'non-assured' 
competitors. There will also be a number of ‘spin-off’ 
benefits from introducing BS 5750, for example, 
various anomalies or gaps in the management system 
will be identified and areas where significant 
improvements can be made will be highlighted. 

However, it has been argued that a truly integrated 
Quality Management System should embody a wider 
brief than those aspects dictated by a specific standard 
alone. In this context, the exponents of TQM (Total 
Quality Management) are leading the way. They 
advocate that in order to realize the whole scope of 
improvements which may result from the introduction 
of Quality Assurance, it is vital to consider the whole 
organization and not just those processes dictated by 
an external standard. Re-engineering of processes is a 
frequent outcome of TQM and this philosophy is 
already gaining a significant reputation for not only 
improving the quality of processes but reducing the 
unnecessary costs of inefficient practices. 

In the final analysis, it is entirely at the discretion of 
the organization itself to choose which route to take. 
The former may not involve such a large initial capital 
outlay, but the rewards may be limited when compared 
to the second TQM-type approach.¹ 


‘Products’ of the Quality System — Quality Manuals 
and procedures 

The Quality Assurance System must be documented 
(and published) in order to achieve its objectives. 
Quality Systems Documentation must be distributed 
internally to staff (so that they are aware of their 
responsibilities) and certain items (i.e. The Quality 
Manual) are additionally distributed externally to 
customers. The Quality Manual is an important 
document and is one of the first items which an auditor 
or assessor will require if the organization is to be 
formally accredited (i.e. audited against a recognized 
Quality System standard). 

The Quality Manual. The Quality Manual is a 
documented statement or description of the 
organization’s Quality Programme and is officially 
defined in BS 4778 as ‘A document setting out the 
general quality policies, procedures and practices of 
an organization’. The contents of the Quality Manual 
will typically include: 


— A signed policy statement from the Company 
Chairman or Managing Director outlining the 
organization’s commitment to Quality 

— A list of responsibilities with respect to 
maintaining the Quality Programme 

— A description of the mechanism in place to 
control those activities which potentially impact 
the quality of the final product. 


The ‘mechanism’ for controlling the critical activities 
is an interesting point of discussion. It may take a 
number of different forms but the most usual is a series 
of written procedures. 

Procedures. The usual way of controlling activities 
within an organization is through a system of formalized 
procedures. Procedures are the documented controls 
over critical activities and describe how an activity is 
to be performed and by whom. They should be written 
according to a standard format, but there are a number 
of different ways in which they may be published. The 
particular style chosen is at the discretion of the 
company but must obviously be geared to the particular 
‘culture’ of the organization. For example, a number 
of companies choose to include flowcharts (of various 
types) within their procedures as these can be a useful 
way of drawing attention to the important activities 
while keeping the number of textual pages to a 
minimum. 

Procedures are not the only way of controlling 
activities: training courses, company notices and signs, 
on-the-job training and supervision and, of course, 
auditing, are just some of the other methods of 
instituting control. Computer systems can even be 
thought of as a form of automated procedural control 
although it is unwise to entirely substitute the normal 
(manual) methods of control with these systems. 
Computers should be thought of as ‘tools’ — they are 
useful ways of enhancing various processes but are 
not a substitute for manual control or adequate training. 


Auditing the Quality System 

One of the most important ways of maintaining control 
of the Quality System is through the process of audit, 
the purpose and value of which can be summarized as 
follows: 


1) Auditing provides objective evidence of the 
effectiveness of the Quality System. 

2) Deficiencies and deviations in the Quality 
System (and its component parts e.g. procedures) 
can be identified and addressed before they 
become significant. 

3) The audit process provides an opportunity to 
review the effectiveness of the current controls. 

4) Auditing ensures the maintenance of any internal 
or external standards to which products and 
services are being delivered. 


Formal Quality System documents (i.e. procedures) 
are fundamental to the auditing process because it is 
against the processes described in these documents 
that the Quality System will be audited. Any 
discrepancies will be highlighted and documented in 
the form of ‘Corrective Actions’ — which must be 
‘closed out’ prior to the next audit being undertaken. 

The introduction of a programme of internal audits 
is a requirement of BS 5750 and is fundamental to the 
philosophy of Quality Assurance since it is the method 
of ensuring that the Quality System is continuing to 
meet its objectives. It is also a requirement that auditors 
(personnel with stated auditing responsibilities) be 
specifically appointed by the organization and trained 
so that they are suitably qualified to carry out their 
task. Training should be carried out in compliance 
with a recognized national scheme such as that 
administered by the IQA, the Registered Assessor 
Scheme. 


Preparing for the implementation of BS 5750 

Before implementing BS 5750, there are a number of 
issues to consider. These will influence the nature and 
scope of the Quality System and include such issues 
as: 
— What is the purpose of implementing a Quality 
Management System (e.g. to gain competitive 
edge, to cut the costs of inefficient practices, to 
tighten-up management control and so on)? 

— What standard is to be used to define the scope 
of the Quality System and are there any 
accompanying schedules which need to be 
considered? 

— Is the Quality Standard to be used for internal 
guidance only or is it a stated requirement of our 
customers? 

— How will the Quality System be maintained, 
what scale of resources are to be apportioned to 
it? 


Once these issues have been addressed, the actual 
process of implementing the Quality System can begin. 
The following section of this paper considers briefly 
some of the main activities (referred to as ‘tasks’) 
which must be undertaken. 


BS 5750 — A guide to implementation 

It is impossible within the scope of this paper to 
describe in detail every potential task or activity which 
may be required in implementing a Quality System. 
Naturally the precise sequence and scale of these tasks 
differs according to the particular organization. The 
main tasks however are described below and depicted 
in flowchart form in Figures 1, 2 and 3 which should 
be referenced by the reader in conjunction with the 
text. 

Figure 1 (Quality Assurance Programme 
Implementation — An Overview) gives a complete 
overview of the major activities which are required in 
the implementation of the Quality Assurance 
Programme. It assumes that some form of formal 
accreditation is being sought (i.e. to a National Quality 
System Standard such as BS 5750). Each major task 
on the flowchart has been numbered and for ease of 
reference these numbers are referred to within the task 
headings. 


Task 1 — Defining the scope of the Quality 
Programme and preparing a Costed Action Plan 

A detailed breakdown of the activities to be 
undertaken within the scope of this task is given 
in Figure 2 (Defining the Scope of the Quality System 
and Preparing a Costed Action Plan). The 
implementation of a Quality System costs money and, 
in business terms, it is essential to make some estimate 
of the scale of resources which will be required to 
achieve this so that management can commit and plan 
for the necessary funds and resources. 

Figure 1: Quality Assurance Programme 
Implementation: An Overview 

[Flowchart of the major tasks: 1.0 Define the scope of the 
Quality Programme and prepare a costed implementation plan, 
including resources; 2.0 Obtain management approval; 
3.0 Commence implementation of the Quality Programme; 
4.0 Prepare an ITT for Certifying Authorities (CAs); 
5.0 Select CA and book an assessment; 6.0 Conduct ‘dress 
rehearsal’ (pre-assessment) audit and ensure any corrective 
actions are closed out; 7.0 Undergo formal assessment by CA; 
8.0 Obtain accreditation; 9.0 Mount publicity campaign.] 

The definition of what constitutes a ‘critical activity’ 
has already been discussed.² Taking the ‘simplest’ case 
(i.e. that the organization wishes only to address those 
activities as dictated by the relevant standard) the 
requirement headings within the standard will still 
need to be interpreted into the activities actually 
undertaken by the organization. Typically the activities 
undertaken by the company do not ‘sub-divide’ into 
the exact divisions (as specified by the requirement 
headings in the standard) and some form of 
cross-reference matrix may need to be drawn up. 
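
As a minimal sketch of the cross-reference matrix mentioned above, the following Python fragment maps requirement headings to the activities that address them and lists any heading that no activity yet covers. The headings, activities and job titles are hypothetical placeholders chosen for illustration; they are not quoted from BS 5750, and a paper table or spreadsheet would serve equally well.

```python
# Illustrative data only: these headings, activities and owners are
# hypothetical placeholders, not quoted from BS 5750.
REQUIREMENT_HEADINGS = [
    "Document control",
    "Purchasing",
    "Training",
    "Internal audits",
]

# Each activity records who is responsible and which headings it touches.
ACTIVITIES = {
    "Book ordering": {
        "owner": "Acquisitions Librarian",
        "headings": ["Purchasing", "Document control"],
    },
    "Staff induction": {
        "owner": "Personnel Officer",
        "headings": ["Training"],
    },
    "Enquiry handling": {
        "owner": "Information Officer",
        "headings": ["Document control", "Training"],
    },
}


def build_matrix(headings, activities):
    """Return {heading: {activity: bool}} marking where each heading applies."""
    return {
        heading: {
            name: heading in info["headings"]
            for name, info in activities.items()
        }
        for heading in headings
    }


def uncovered(matrix):
    """List headings that no activity is mapped to (gaps for the review)."""
    return [h for h, row in matrix.items() if not any(row.values())]


def render(matrix, activities):
    """Format the matrix as plain text, one row per requirement heading."""
    names = list(activities)
    width = max(len(h) for h in matrix)
    lines = [" " * width + " | " + " | ".join(names)]
    for heading, row in matrix.items():
        cells = [("X" if row[n] else "").center(len(n)) for n in names]
        lines.append(heading.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


matrix = build_matrix(REQUIREMENT_HEADINGS, ACTIVITIES)
print(render(matrix, ACTIVITIES))
# prints: No activity mapped to: ['Internal audits']
print("No activity mapped to:", uncovered(matrix))
```

A heading with no activity against it (here the placeholder ‘Internal audits’) is a natural starting point for the structured review, since it suggests a requirement that nobody in the organization currently owns.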

When it has been established where activities 
representative of the standard’s requirements are 
occurring within the organization (and who is 
responsible), it is necessary to conduct a structured 
review of them. This is almost a pre-implementation 
audit, the purpose of which is to establish just how 
adequately these activities are currently being 
controlled. The judgement will, to a certain extent, be 
subjective and will also depend on the experience (and 
even the staff position) of the reviewer. The review 
should address such questions as: 


— Are there any documented controls on the activity 
in question, what form do they take and are they 
adequate? 

— Are the personnel responsible for this activity in 
possession of the relevant instructions and do 
they fully understand their responsibilities? 

— Are there any problems associated with 
performing this activity and of what nature are 
they (ad-hoc or recurring)? 


and so on... 


Figure 2: Defining the Scope of the Quality 
System and Preparing a Costed Action Plan 

[Flowchart, in sequence: Agree the scope of the Quality 
System; Define the critical activities; Conduct a structured 
review of critical activities against the required standard 
and scope; Define where current controls are inadequate and 
need to be improved; Prepare a prioritized list of the 
procedures which need to be written and the most suitable 
form and content for the procedures; Assess the resources 
necessary to improve work practices and produce the 
necessary procedures; Prepare a costed schedule and specify 
any external consulting services required or grants which 
may be available; Prepare the costed Action Plan.] 


Figure 3: Implementing the Quality System — Typical Activities 

[Flowchart of steps numbered 3.1 to 3.8; box labels, in the 
order they appear: Define organization and quality-related 
responsibilities; Write job descriptions; Establish action 
team; Educate action team in purpose and objectives; Write 
the Quality Manual; Assign responsibilities for the 
preparation of agreed procedures; Train action team in 
procedure writing; Prepare draft procedures; Review, amend 
and agree procedures; Educate workforce; Train auditor(s); 
Implement corrective actions.] 


28 Aslib Proceedings, vol. 45, no. 1 


Eventually a picture will emerge of where current 
controls on processes are inadequate and need to be 
improved. An estimate must then be made of the 
resources necessary to correct them. In some cases 
new or improved procedures may be necessary, the 
lack of adequate training may be a problem or there 
may be wider and more subtle issues to consider such 
as staff motivation and morale. 

A checklist of necessary improvements will need to 
be drawn up and a decision made as to how the 
necessary resources are to be provided. Questions 
which will have an impact on the budget for the 
Programme include: 


— Do we have adequate expertise in-house to 
correct the situation (e.g. to write the necessary 
procedures)? 

— How will the exercise affect our production/ 
cash flow? 

— Are there any grants available which might help 
and what are the rules of applying for these? 
(e.g. the DTI Enterprise Scheme) 

— What form of external (consultancy) help might 
be appropriate? 

— Should this external help be provided in the 
form of ‘training’ or ‘tasks’? 


Finally, some form of costed action plan will need 
to be prepared for submission to management. 


Task 2 — Obtain management approval/commitment 
of senior staff 

Strong leadership and the commitment of management 
personnel are essential if the new Quality System is to 
succeed. Management approval is usually ‘crystallized’ 
in the production of an official Quality Policy Statement 
(signed by the Chairman or Senior Executive) which 
describes the intention of the organization to adopt a 
Quality Assurance Programme and commit the 
necessary funds. A signed policy statement of this 
type is a necessary inclusion in the company’s Quality 
Manual. 


Task 3 — Implement the Quality Programme 

A detailed breakdown of activities to be undertaken in 
implementing the Quality Programme is given in 
Figure 3 (implementing the Quality programme). The 
production of a statement of responsibilities and the 
appointment of an Action Team to lead the initiative 
are the first important tasks. 

The statement of responsibilities should indicate 
those personnel who are specifically responsible for 
the assurance of quality within the organization. There 
may be a dedicated Quality Manager, Lead Auditor, 
Assistant Auditors, Quality Control personnel and so 
on. In a small organization, these responsibilities may 
not translate to dedicated individuals, but at least the 
quality management and auditing responsibilities must 
be nominated to specific personnel and specified 
within specific Position or Job Descriptions and shown 
on accompanying organization charts. 

The Action Team may comprise those individuals 
actually charged with developing new procedures, 
writing job descriptions etc., or may be made up of 
personnel of a senior nature who have a steering 
committee function and are responsible for overseeing 
the status of the programme overall. The steering 
committee can perform a useful role by adjudicating 
on the scope and format of the necessary procedures 
which are to be developed. 

Procedures should be prepared in a standard format 
and they must additionally be authorized by an 
appropriate signatory. The first issue of a procedure 
(and its ongoing amendment and distribution) must 
be strictly controlled. The Action Team will need to 
make a ruling (if there is no precedent from existing 
documents) on the format and content of such 
procedures and the authorization process to be adopted. 

Once the necessary procedures have been drafted, 
they must be issued to the staff and after allowing for 
a period to settle in, activities must be audited against 
them. A series of internal audits must be conducted, 
corrective actions identified and closed out. Throughout 
the implementation process, training is an important 
activity and may be necessary in the following topics: 


— Quality Assurance — General Introduction 

— Procedure Writing 

— Auditing 

Once there is a high degree of confidence that the 
relevant controls are in place, it is then necessary to 
consider which Certifying Authority to approach for 
formal accreditation — see Figure 1. 


Tasks 4.0 and 5.0 — Selection of the certifying 
authority 

There are a number of bodies who are certified to 
accredit organizations to BS 5750 and any relevant 
accompanying schedules. Organizations such as the 
British Standards Institution and Lloyd’s Register are 
accredited to assess companies in a wide range of 
different industry sectors, while other accreditation 
bodies have a more limited remit. The process of 
formal accreditation is quite involved and can be 
costly — depending upon the size and nature of the 
organization. It is usually a good policy to prepare an 
Invitation to Tender (ITT) for accreditation services. 
A firm estimate of the costs of the accreditation 
process should be gained and it is also useful to meet 
the various certification bodies to gain an insight into 
their methods. 

After the initial assessment, the process of 
accreditation is, in fact, ongoing and a series of annual 
or biannual visits will be conducted as long as the 
company wishes to retain its accredited status. The 
scope of each individual assessment may be slightly 
different. In large organizations, it is usual for the 
certifying authority to focus on a sample of activities 
and rotate their focus of attention over the years. 
Certificates are issued at each assessment and provide 
‘accreditation cover’ for the next immediate period. 


Task 6.0 — Conduct programme of internal audits 
and final ‘dress rehearsal’ 

Once the certifying authority has been chosen, a date 
will be given for formal assessment. One of the first 
items that the Certifying Authority will ask for is the 
Quality Manual and a list (or even copies) of the 
organization's procedures. It is a requirement of the 
certifying authorities that the organization has had a 
Quality System implemented for a minimum of two 
years before a formal assessment can be undertaken 
and accreditation granted. Naturally, one of the 
principal items of evidence of this fact is the record of 
all the internal audits undertaken and any corrective 
actions closed out. 

Not surprisingly, it is one of the requirements of BS 
5750 that there be a formal method (i.e. procedure) for 
carrying out such internal audits. Most organizations 
have supplemented this with a procedure to be followed 
in respect of external assessment or audit. It is usual to 
put this to the test in a ‘dress rehearsal’ so that any 
potential problems can be ironed out before the official 
accreditation takes place. 


Tasks 7.0 and 8.0 — Undergo formal assessment and 
obtain accreditation 

The formal accreditation process may last 2 to 5 days 
and depends upon the size and complexity of the 
organization seeking accreditation. If accreditation is 
awarded a suitable certificate will be prepared and 
issued. The certificate will clearly state to which part 
of BS 5750 the organization has been accredited (i.e. 
Part 1, 2 or 3). The organization may obtain 
accreditation with conditions, in which case the 
certifying authority will need to revisit to ensure that 
these conditions or recommendations for amendment 
have been implemented or closed out before 
unconditional accreditation is granted. 


Task 9.0 — Publicize and communicate 

When a company has achieved accreditation to a 
recognized standard, it is usual to publicize the fact. 
Where a particularly innovative Quality System has 
been designed or accreditation gained in a new sector 
of industry it may also be appropriate to apply for a 
British Quality Award. As with the 
product Kite Mark (see earlier in this paper), there is 
a recognized symbol for accreditation to BS 
5750. It could be thought of as a ‘kite mark’ for the 
company's management (or Quality) system and 
conveys a similar degree of confidence or 
understanding to potential customers. 


1 It should be noted that BS 5750 comprises 6 
individual parts, with 2 Introductory sections. The 6 
principal sections comprise 3 ‘pairs’ of specific 
requirements and accompanying 'guidance notes'. 
The 3 specific requirements are similar, but are geared 
towards the differing complexities of activities that 
may be undertaken by a company. Part 1 is applicable 
to the most complex type of organization and includes 
requirements for those companies engaged in ‘design’ 
activities, for example. Part 3 is targeted at 
organizations who do not have a design function and 
therefore the standard’s requirements in this respect are 
not applicable. 


2 Note the remarks made in note number 1. The 
decision as to which Quality System Standard is to be 
used includes deciding which Part of that Standard is 
applicable. 


Linda Wedlake can be contacted at Information 
Control Ltd, Phoenix House, High Road, Benfleet, 
Essex, SS7 5HZ. 

Business information in Hungary 
Impressions of the contemporary scene 


Kaye Towlson 
Information In Business, De Montfort University 


Hungary is one of the fastest moving Central European 
countries in the shift towards the market economy, 
with a target of 50% privatization by the year 1994¹. 
Part of this rapid change is a growing and substantial 
interest throughout the Hungarian information world 
in business information, particularly that available in 
the West. There is an acceptance within the information 
profession of the need to develop a business 
information infrastructure to support the needs of the 
growing community of private SMEs (small to 
medium sized enterprises) in the move towards a 
market economy. This article will consider this 
development, taking account of existing Hungarian 
sources and barriers to development. 

The enthusiasm and commitment of Hungarian 
information workers to the development of a national 
business information network is expressed in all sectors 
of the profession. This has become apparent in contacts 
with Hungary over the past years. For example, through 
work aimed at raising awareness of business 
information and its national development, a keenness 
to develop the existing library network to support the 
growing body of Hungarian enterprise has clearly 
emerged. Much of this work has been conducted 
under the auspices of the co-operative agreement 
between the UK Library Association and the Hungarian 
Library Association, now due for renewal. 

Several information services for business already 
exist as part of the established library network. For 
example, the computerized services team of OMK, 
the Hungarian Central National Technical Library in 
Budapest, offer online searching facilities to companies, 
government organizations and research institutions 
all over Hungary. These services have been available 
on a commercial basis for the last ten years. During 
this time OMK have built up considerable experience 
in online searching, helping them to link their name 
with online searching throughout Hungary. This link, 
coupled with resulting good relations with online 
vendors has enabled OMK to take an active role in 
online training in Hungary, both as a trainer and a 
venue’, 

The University of Economics in Budapest has 
developed its own database ‘Econinfo’. This contains 
bibliographic and subject descriptors of all documents, 
of national and international origins, added to library 
stock since 1990. Some entries include abstracts, 
annotations and key facts and data, particularly those 
defined as business information. The database may be 
searched by standard bibliographic data, e.g. author, 
title, publisher, or by thematic terms, i.e. keywords 
including company and individual names. The 
keyword system operates in three languages: 
Hungarian, English & German. The University library 
offers remote access to the database on a commercial 
basis; a service to which a significant number of 
Hungarian companies and organizations subscribe. 

Other libraries in the country offer similar online 
services. For example, the University of Veszprem 
offers online search facilities to local organisations on 
a fee basis. Like OMK, this service has been in 
operation for 10 years. The service has its roots in the 
provision of chemical and chemical engineering 
information, and this topic remains the main focus of 
the service. Plans to improve the service include the 
broadening of subject coverage to include business, 
linguistics and education’. 

Another major supplier of business information is 
Kopint Datorg, a marketing and computing limited 
company formed from the former governmental bodies 
of the Institute of Economic and Market Research, 
and the Computer Bureau. The company generates 
and publishes economic forecasts and bespoke market 
research. Thus Kopint Datorg hold a rich source of 
business information including trade statistics, market 
research and economic indicators. The company takes 
an active and innovative role in the information 
community, an example of which is Kopint Datorg’s 
latest development of an inter-library co-operation 
group in response to the cessation of government 
subsidies to certain libraries. Organized by Kopint 
Datorg, these libraries have joined together to form a 
club. Monthly meetings offer a forum for the exchange
of knowledge and experiences; the club acts as a
vehicle to aid professional current awareness and offers
discounts on certain training events; and significant
discounts from library suppliers have been negotiated
for club members’.

However, the above institutions remain the 
exception rather than the rule. There are no 
comprehensive business information services, in the 
sense of UK services like Information In Business at 
De Montfort University, Warwick Business 
Information Service or the London Business School’s 
business information service. 

Despite the fact that fully comprehensive business 
information services are yet to develop, there exists 
quite a range of domestic business information sources, 
from directories to business journals and statistics’.
However, domestic business information is not without
problems and does require much development. This,
hopefully, will occur in parallel with the development 
of the market economy. Two main gaps exist in the 
provision of domestic business information and they 
occur in two key areas: company information and 
market information. 

Company information exists on a basic level, ie. 
in terms of contact details for producers and suppliers. 
There are several Hungarian directories available 
which supply this type of information, and the Ministry 
of Trade maintains company databases giving full 
contact details plus the legal status of the company 
and, where available, the number of employees. 

In general it is difficult to find out more about a 
company. It is not a general policy for libraries to 
maintain collections of company annual reports or 
trade literature, mostly because this information is 
not readily available. The company information 
gap occurs mainly in financial data, due to the lack 
of a central or regional filing point for company 
financial information. Consequently domestic 
channels of company information would benefit
from the development of more rigid rules and
mechanisms for the filing of company accounts. As
the Hungarian stock exchange develops, more
company information will be available, a fact
supported by the recent appearance of Extel cards 
for Hungarian quoted companies. Even so, there 
remains a large body of companies for whom 
financial data is not available. 

This lack of a central infrastructure for filing 
accounts results in the absence of credit reports and 
general financial company data; a lack of essential 
information on which to base sound commercial 
decisions. However, if Hungary becomes an associate 
member of the EC, as expected, it may be subject to 
EC regulations for accounting methods and filing 
requirements; this would be of great benefit to the 
Hungarian information world and, in turn, the 
Hungarian business community. 

An ongoing advance in the provision of Hungarian
company information is the database currently under
development at the Hungarian Chamber of Commerce.
The system offers a facility for matching the demand
and supply of goods on both domestic and foreign
markets by ‘marrying’ sellers and buyers. This has the
added advantage of increasing awareness and access 
of foreign investors and customers to the services of 
Hungarian companies. Once this project is realized, 
the Chamber plan to make their system available to 
Hungarian enterprises nationwide’. 

Consumer marketing information is equally
scarce. Although the Institute of Market Research in
Budapest produces 100 data surveys per annum and
maintains regular contact with 60,000 people, and 
despite the services of Kopint Datorg, there is a 
general dearth of Hungarian consumer marketing 
information. The need for, and hopefully the 
availability of, consumer market information will 
grow along with the market economy. 

Statistics are produced by the Hungarian Central
Statistical Office. The office produces many 
publications, though sadly, for us, only one of them is 
in English. Unfortunately, on the international market 
the quality and validity of these statistics is tainted by 
the previously poor reputation of Central and Eastern 
Block official statistics. 

The commitment of the Hungarian information 
profession to the development of business information 
services has many implications for the country’s 
development of information sources and the training 
of information workers. A general attitude evident 
in Hungary, and commented on by many Hungarians, 
is one of running before you can walk. The Hungarian 
information profession is aware of the new 
technologies like online services and CD-ROMs and 
the fact that there is a wealth of business oriented 
sources available via these media in Western Europe. 
Expectations are raised and the information profession 
would like to provide this type of service tomorrow, 
but as with all development, a step by step approach 
is essential. 

The development of a business information 
network in Hungary, particularly aiming at the needs 
of budding SMEs, will be a gradual process. There
are two key elements to this process: 

1 The development of sources (domestic and
  international) and services

2 The training of personnel to staff the services
  and realize the full potential of sources.

As stated above, the Hungarian domestic business
information market requires much development, both
in terms of available sources and services, especially
if it is to be on a par with such services in the West.
However, a firm foundation exists in the national 
published sources and the Hungarian library network 
coupled with economic developments like the 
Hungarian Stock Exchange. Many information 
products covering Central and Eastern Europe have 
appeared in the West, but it would be ideal if Hungary 
could grow its own domestic company/business
information sources alongside the emerging market 
economy. 

The second key element to the development of a 
business information network in Hungary is that of 
training; training for existing and would-be 
information professionals plus training for the business
community. In our own business community in the 
UK there are many people who do not fully understand 
the true value of information. They do not appreciate 
that it is an essential element for all quality business 
decisions. Furthermore, they find it hard to grasp that
this information does not come free, but is sold at a 
price reflecting its true commercial value. Thus, we in 
the information brokerage business have to make 
great efforts to foster this comprehension of these 
basic facts. The task of converting a business 
community previously used to operating in the fixed 
economy of communism will demand a major and 
continuous campaign. 

Hungary needs to grow a new generation of
information professionals trained in all aspects of 
exploiting business information sources, in service 
delivery, marketing, management and customer care. 
It must train those who are already practising
professionals in these matters, enabling them to
develop and provide key business information services 
for growing businesses. 

The text above indicates the need for much 
development, and it is important to highlight the 
existence of potential barriers to development, of 
which there are several. A major barrier is that of
finance: obviously the development of sources and
staff will cost a lot of Hungarian forints and cannot be 
achieved overnight. However, much work has been 
accomplished in this area with aid from external 
agencies like the United Nations and the British 
Council. Other potential sources of finance or 
sponsorship are the World Bank, the Know-How fund 
and relevant EC grants. 

Many Hungarians perceive language as a barrier 
to their entry into and knowledge of international markets.
In Hungary, German appears to be the most popular 
second language with English taking third place, 
although personal experience suggests that English 
speakers are not thin on the ground within the 
information profession. Furthermore, a general opinion 
expressed by information workers is that English is 
gaining in popularity as a second language, particularly 
amongst the younger generation. Therefore, although 
language may be a barrier to the effective use of 
international information sources, it is not a hurdle to 
the development of Hungarian business information 
sources. Moreover, by recognizing the need to 
encourage at least a working understanding of the 
English language, the profession are taking steps to 
overcome this obstacle. 

A major hurdle to development is that of human 
resources. Not only is there a need for substantial 
training, there is an implicit requirement for a change 
in attitude and behaviour; change being a force to 
which most human beings are very resistant. New 
attitudes and concepts of service and customer care 
would have to be taken on board along with the notion 
of working to specific deadlines. Perhaps most 
frightening, and indeed alien, to Hungarians emerging
from the communist regime, is the concept of personal 
responsibility. This is a problem acknowledged by 
many Hungarians. For so long the Hungarian people 
lived and worked in an environment where decisions, 
responsibility and blame were taken centrally; thus
individual decision making and
responsibility are almost unknown concepts. This has
further repercussions in that people are not used to 
taking the initiative. Hopefully both personal 
responsibility and initiative will come with time. 
Despite the above hurdles the current mood in the 
information profession is one of guarded optimism 
and a relative openness to new ideas and concepts.
There is an identified need within Hungarian 
information circles for the development of a business 
information infra-structure to help the transition to 
the market economy. It now remains for those in 
information to convince those in business and the 
government to help them to achieve this aim. 


A study of journals needed to support the Project 2000
nursing course with an evaluation of citation counting 
as a method of journal selection 


Paul Moorbath 
Senior librarian, St. Bartholomew's College of Nursing and Midwifery, London EC1A 7BE 


Abstract 

The nursing journal titles needed to support Project 2000 were considered. In order to reflect the structure of
nursing literature as a whole, a citation count from the Citation Index for 1990 was undertaken. In order to rank
journals in each of the 4 branches of Project 2000, an analysis of the citations in a leading journal representing each
branch was undertaken. To reflect student usage, a survey of photocopier use and citation in student bibliographies
was undertaken. In order to reflect what titles the library ought to have, a questionnaire survey of tutors was
undertaken. The ranking of titles in the Citation Index was tested for correlation with the ranks obtained from student
use and tutor recommendation, and the correlation between student use and tutor recommendation was drawn.
Finally, a scheme for combining the rankings of journal titles obtained by the methods above was devised in order
to produce an overall ranking of the principal titles.


Introduction 

There have been no studies on which journals could 
be recommended for a Project 2000 nursing college to
take. The idea that the number of citations to a journal 
title in the literature indicated its importance was first 
proposed by Gross and Gross! in 1927 but the idea has 
been followed up by others eg. Garfield?. Gregory? 
used citation analysis of bibliographies on medical 
subjects to rank the importance of medical journals. 
Others eg. Sengupta et al have taken this a stage 
further by undertaking such a study for the purpose of 
determining which journals to purchase on a limited 
budget. There have not been any such studies in 
nursing libraries. 

Doubt has been cast on citation analysis as a 
means of journal selection for college libraries. Scales
found no correlation between the ranking of journals 
obtained by citation analysis and the rank of journals 
based on analysis of loans from the British Library. 
Dhawan‘, Downes’, and Stankus and Rice? agreed 
with the scepticism of Scales as did Line and Par? 
who concluded that the only useful guide to journal 
value would be a local survey. 

Surveys on the use of journals by the placing of 
paper slips for readers to mark in the journals have 
been undertaken by Campbell and Langlois’. 
However this method is prone to fraudulent marking 
and non-response, and does not reflect the extent to 
which an issue is used. 

Bibliographies at the ends of doctoral theses 
and undergraduate assignments have been examined 
for journal use by McCain and Bobick", Chambers 
and Healy", Hardesty and Oltmann", St. Clair and 
Magrill! and Lewis!5. Kriz!”, having conducted such
a survey, used the information to cancel journal
subscriptions. In nursing the only survey has been
done by Gay!* on PhD theses in nursing, but
this was an American study and was not of basic
students, so is of limited value in the British context.

The use of citation counts or reader usage does not 
give an indication of what publications students ought to 
read; for this it is necessary to consult tutors. Trainor”, 
Walter”, and Grefsheim? enlisted the help of lecturers 
or faculty members to select journal titles for cancellation. 

There have been few studies to compare lecturers’ 
rankings of journals with rank by citation counting. 
McAllister et al? and Gordon? found significant
correlations between academics’ evaluations of journal 
rank and the rank obtained by citation analysis. This 
seems to conflict with Scales”. 

Hafner ranked journal titles retrieved under a
variety of medical subject headings (MeSH) but many 
were foreign language titles and not useful in a British 
nursing library. 

Although South-west Thames” produced a check 
list of journals for nursing libraries, it was not aimed at 
Project 2000 and lacked some nursing journal titles. 

The objective of the present study is to rank
journals by a) citation count from the Citation Index,
b) citation count from journals representing each of
the four branches of Project 2000, c) student survey, and
d) tutor recommendation. The rank of titles obtained
from the Citation Index is tested for correlation with 
the rank from student use and tutor recommendation 
and the rank from student use is tested for correlation 
with tutor recommendation. Finally, a scheme 
combining the scores for ranking of journals by the
four methods above is presented to produce an overall 
list of core nursing journals. 


Method 

a) Citation analysis The occurrence of citations to 
papers and hence journal titles was obtained by 
examination of the citations listed in the Citation 
Index in the International Nursing Index 1990 issue 3
i.e. the last quarterly issue published. 

The count included citations to journals and 
from journals listed in Nursing Bibliography which 
reflects the holdings of the Royal College of Nursing
library and hence British nursing practice. In 
addition it was decided to extend the titles counted 
to include the following publications, not held at 
the RCN but which seem of international interest. 
These are Seminars in Oncology Nursing, Journal 
of Professional Nursing, Critical Care Nurse, 
Journal of Nursing Quality Assurance, and 
Dimensions in Critical Care Nursing. It was decided 
to exclude state magazines such as Mississippi RN 
etc.; many of these are not even available from the 
British Library. 

It was decided to include the following medical 
journals, held at the RCN. These were BMJ, Lancet, 
Journal of the Royal Society of Health, and New 
England Journal of Medicine. 

Only citations to and in the publications being
considered were included.

b) Citation analysis in branches All Project
2000 students follow a common core course for 
the first eighteen months and then they elect to 
study one of the branch programmes in adult 
nursing, paediatric nursing, mental handicap, or 
psychiatric nursing. 

Citations in a journal to represent each of these 
four branches were counted for a year. Where 
possible a British journal was used to reflect British 
practice. 

To represent the adult branch, the citations in
the Journal of Advanced Nursing for the whole of
1990 were counted. The top 40 titles cited were
listed. The figures for the lower ranked titles were
small and prone to distortion by self- or idiosyncratic
citation. Citations in the paediatric branch were 
determined by examination of all citations in 
Paediatric Nursing for 1990. For mental handicap,
all citations in the British Journal of Mental 
Subnormality for 1989 were counted as one of the 
1990 issues was devoted to research in Israel. The 
selection of a suitable journal in psychiatric nursing
was a problem as there is no British publication. 
The main primary journal is Archives of Psychiatric 
Nursing but a preliminary check showed that the 
journals cited were almost entirely American and 
were to medical and psychiatrists’ publications. 
The only other option was the main secondary 
journal, the Journal of Psychosocial Nursing which 
is also American. 

c) Student use In two colleges of nursing, students
were asked to fill in a sheet placed by the self- 
service photocopier asking them to fill in the names 
of journals used and their set number if they copied 
articles. The surveys were conducted at both sites 
for two months and were not continued longer to 
avoid reader fatigue. From the set numbers it was 
possible to extract the Project 2000 readers. 
The bibliographies at the ends of student
submissions on a review of research in any area
of interest were examined after permission had 
been granted. 

d) Correlation of student use and citation in the 
Citation Index The ranking of journals obtained 
from a citation count in the Citation Index was 
plotted against the rank of titles used by students 
as a scattergram. The titles included only those 
available to students. The two sets of ranks were 
tested for correlation using Spearman's r_s, as in
Downie and Heath? and Hicks?! whose tables 
were used. 
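
The rank correlation test described above can be sketched as follows. This is a minimal illustration, assuming the standard textbook formula for Spearman's coefficient without tied ranks, r_s = 1 - 6Σd²/(n(n²-1)); the paper does not state which computational form was used, and the function name and the sample ranks are hypothetical.

```python
from typing import Sequence


def spearman_rs(ranks_a: Sequence[float], ranks_b: Sequence[float]) -> float:
    """Spearman's rank correlation for two rankings of the same titles.

    Assumes no tied ranks; with ties this shortcut formula is only approximate.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same titles")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two titles are needed")
    # Sum of squared differences between the paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n * n - 1))


# Hypothetical ranks for five titles (not data from the study).
citation_index_rank = [1, 2, 3, 4, 5]
student_use_rank = [2, 1, 4, 3, 5]
print(spearman_rs(citation_index_rank, student_use_rank))
```

The resulting coefficient is then compared with the critical value for n pairs in a published table, as the text describes.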

e) Tutor recommendation A questionnaire was 
distributed amongst the tutors of two colleges of 
nursing. They were asked which journals they 
considered ought to be purchased to support Project 
2000. In order to elicit a reasonable response, the 
survey was kept down to one page and it listed the 
top 20 titles taken from the Citation Index except
the Journal of Professional Nursing as it was 
supposed this might be confused with the British 
publication, Professional Nurse. To reflect British 
practice, these titles were followed on the 
questionnaire by British titles together with a 
leading secondary title, RN which is American. 
Respondents were asked to tick whether these 
titles were regarded as essential/ important/ useful 
or not necessary and the responses were scored as 
follows: 


Opinion              Score
a) essential          3
b) important          2
c) useful             1
d) not necessary     -1


From this a mean score was given by the following:

mean score = total score / number of responses


It was decided that the response ‘not necessary’ 
should have a minus score as this was expressing a 
negative opinion. If no opinion was expressed then 
this had the neutral value of 0. 
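
The scoring scheme above can be illustrated with a short sketch. It assumes that "number of responses" counts every returned questionnaire, including those expressing no opinion on a title (scored 0); the text does not make the denominator explicit. The function name and sample replies are hypothetical.

```python
from typing import Optional, Sequence

# Scores taken from the table above; "no opinion" is handled separately as 0.
SCORES = {"essential": 3, "important": 2, "useful": 1, "not necessary": -1}


def mean_score(opinions: Sequence[Optional[str]]) -> float:
    """Mean tutor score for one title; None stands for 'no opinion expressed'."""
    if not opinions:
        raise ValueError("no responses received")
    total = sum(0 if opinion is None else SCORES[opinion] for opinion in opinions)
    # Assumption: every returned questionnaire counts as a response.
    return total / len(opinions)


# Hypothetical replies from five tutors for a single title.
replies = ["essential", "essential", "important", "not necessary", None]
print(mean_score(replies))
```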

Tutors were asked to add any titles not on the 
questionnaire. 
f) Correlation of tutor recommendation and 
student use and citation analysis The rank of 
titles from tutor recommendation was tested for 
correlation with rank in the Citation Index and
student use using Spearman’s r_s, and scattergrams
were plotted. 


Results 

a) Citation analysis in Citation Index 

The total number of citations in the Citation Index 
1990 3rd quarter are shown below. The top 40 
titles are shown together with the lower ranked 
British titles. 

Table 1 — Citation analysis in the Citation Index

Nursing Research
Nursing Times
Journal Nursing Administration
New England Journal Medicine
American Journal Nursing
Heart & Lung
Oncology Nursing Forum
Nursing Outlook
Research Nursing & Health
Journal Nursing Education
BMJ
Journal Advanced Nursing
Nursing Management
Lancet
Image
Advances Nursing Science
Cancer Nursing
Seminars Oncology Nursing
Journal Professional Nursing
Nursing Clinics North America
Nursing Mirror
Nurse Educator
Western Journal Nursing Research
Nursing Administration Quarterly
Nursing Economics
Nursing
Nursing & Health Care
International Journal Nursing Studies
Journal Neuroscience Nursing
Journal Gerontological Nursing
AORN Journal
Social Science & Medicine
RN
Critical Care Nurse
Journal Continuing Education Nursing
Health Visitor
Archives Disease in Childhood
Nurse Education Today

Lower ranked British titles
Professional Nurse
Senior Nurse
Journal Royal Society Health
Intensive Care Nursing
Health Educational Journal
Nursing (UK)
Nature
NAT News
Community Outlook
Health Service Journal
Midwife Health Visitor...
Nursing Standard

Table 3 — Citations in Paediatric Nursing

Nursing Times
Paediatric Nursing
BMJ
Pediatric Nursing (US)
Pediatrics
Archives Disease Childhood
Journal Pediatric Surgery
Nursing (UK)
Nursing (US)
Lancet
New England Journal Medicine
MCN
American Journal Nursing
Senior Nurse
Professional Nurse
Nursing Standard
Journal Pediatric Nursing
Nursing Research
Pain
Journal Human Nutrition

b) Citation in the branches

The citations in journals in the branches are shown 
below. The top 40 titles in the Journal of Advanced 
Nursing, the top 20 titles in Paediatric Nursing, the 
top 8 titles in the British Journal of Mental
Subnormality, and the top 10 titles in the Journal of 
Psychosocial Nursing are shown in the tables below. 


Table 2 — Citations in Journal of Advanced Nursing

Journal Advanced Nursing
Nursing Times
Nursing Research
BMJ
American Journal Nursing
International Journal Nursing Studies
Nursing Outlook
Advances Nursing Science
Journal Nursing Education
Research Nursing & Health
Journal Nursing Administration
Nurse Education Today
British Journal Psychiatry
Nursing Mirror
Journal Gerontological Nursing
Social Science & Medicine
Senior Nurse
Heart & Lung
Lancet
Nurs Clinics North America
Western Journal Nursing Research
New England Journal Medicine
Nurse Educator
AORN Journal
Journal Medical Ethics
JOGGN
Nursing Administration Quarterly
Nursing Forum
Canadian Nurse
Journal Psychosocial Nursing
Journal Continuing Education Nursing
Topics Clinical Nursing
Australian Nurses' Journal
Image
International Nursing Review
Journal Professional Nursing
Oncology Nursing Forum
Community Outlook
Health Service Journal
Nursing (UK)
Nursing (US)

Table 4 - Citations in British Journal of Mental
Subnormality

Rank Title Citations
British Journal Mental Subnormality
American Journal Mental Deficiency
Mental Retardation
Journal Mental Deficiency Research
Childhood Development
Pediatrics
Educ & Train Mentally Retarded
Community Mental Health Journal

Table 5 - Citations in Journal of Psychosocial Nursing

Rank Title
1 Hospital & Comm Psychiatry
2 Journal Psychosocial Nursing
3 American Journal Psychiatry
4 Archives General Psychiatry
5 Journal Clinical Psychiatry
6 Nursing Research
  Gerontologist
8 Archives Psychiatric Nursing
9 American Journal Nursing
  New England Journal Medicine

c) Student usage and citation


The table below shows the ranking of journals as 
measured by a) the number of photocopies done by 
Project 2000 students and b) examination of student 
bibliographies. 

There were two publications called Nursing. 
Until early 1992 there was Nursing published in 
the USA and there was Nursing published in the 
UK, but this latter has been renamed the British 
Journal of Nursing. The photocopy survey did 
not indicate which of these was referred to and 
hence the figure for rank was obtained from the 
citation count only. 

The two sets of ranks were used to produce an 
overall rank. Where British and American 
publications had the same rank, the British 
publication was given the higher place. Lower
ranked titles were omitted as the numbers of
citations and usage were small.
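
The combination of the two rankings can be sketched as below. The text does not say how the photocopy rank and the citation rank were merged, so this sketch assumes, purely for illustration, that the two ranks are summed; only the tie-break in favour of the British title is taken from the text. The function name, titles and ranks are hypothetical.

```python
from typing import List, Tuple

# (title, photocopy rank, bibliography citation rank, is_british)
Title = Tuple[str, int, int, bool]


def overall_rank(titles: List[Title]) -> List[str]:
    """Order titles by summed rank (an assumed rule), British title first on ties."""
    ordered = sorted(
        titles,
        # False sorts before True, so British titles win a tied rank sum.
        key=lambda t: (t[1] + t[2], not t[3]),
    )
    return [name for name, _, _, _ in ordered]


# Hypothetical figures, not taken from Table 6.
sample = [
    ("Title B (US)", 2, 1, False),
    ("Title A (UK)", 1, 2, True),
    ("Title C (UK)", 3, 3, True),
]
print(overall_rank(sample))
```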


Table 6 — Student usage

Title    Photocopy rank    No. of citations

Nursing Times
Journal of Advanced Nursing
Nursing Standard
Professional Nurse
Nursing Research
BMJ
Nursing Mirror
Nursing (UK)
Cancer Nursing
Midwife Health Visitor...
Senior Nurse
Health Service Journal
American Journal Nursing
Health Visitor
Oncology Nursing Forum

d) Correlation of student use and rank in the Citation Index
The ranks of the top 15 titles used by students as 
above were tested for correlation with the rank in the 
Citation Index. 
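
Because only a subset of titles is compared, each title's overall Citation Index rank has to be converted into a rank within the set before the correlation is computed. A minimal sketch of that re-ranking step follows; it assumes no tied ranks, and the function name and the sample ranks are hypothetical.

```python
from typing import Dict


def rank_within_set(overall_ranks: Dict[str, int]) -> Dict[str, int]:
    """Convert overall ranks into ranks 1..n within the chosen set of titles."""
    ordered = sorted(overall_ranks, key=lambda title: overall_ranks[title])
    return {title: position for position, title in enumerate(ordered, start=1)}


# Hypothetical overall ranks for three titles in the comparison set.
print(rank_within_set({"Title A": 2, "Title B": 45, "Title C": 12}))
```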


Table 7 — Rank by student use and Citation Index 


Title    Student rank    Citation Index rank    Rank within this set
Nursing Times 
Journal of Advanced Nursing 
Nursing Standard 
Professional Nurse 
Nursing Research 
BMJ 
Nursing Mirror 
Nursing (UK) 
Cancer Nursing 
Midwife Health Visitor... 
Senior Nurse 
Health Service Journal 
American Journal Nursing 
Health Visitor 
Oncology Nursing Forum 





From the above table Spearman's r_s = 0.2535. There is
no significant agreement at the p<0.05 level.

Fig.1 — Scattergram of student use and
Citation Index ranks 

e) Tutor recommendation

64 replies were received from the tutors in the two 
colleges and the results are presented below. The top 
25 titles are shown. 


Table 8 — Rank by tutor recommendation 
Column a: no. of replies ‘essential’
Column b: ‘important’
Column c: ‘useful’
Column d: ‘not necessary’


Number of tutor opinions Total Mean 
a b c d total score score 
1 Nursing Research 
2 Journal Advanced Nursing 
3 Research in Nursing & Health 
4 Paediatric Nursing 
5 Nursing Times 
6 Professional Nurse 
7 Nursing (UK) 
8 Nurse Education Today 
9 Nursing Standard 
10 Journal Psychosocial Nursing 
11 Senior Nurse 
12 Midwife Health Visitor... 
13 Nursing the Elderly 
14 Health Visitor 
15 Advances in Nursing Science 
16 Journal District Nursing 
17 Journal Nursing Education 
18 BMJ 
19 Archives Psychiatric Nursing 
20 Nursing Management 
21 American Journal Nursing 
22 Western Jour Nursing Research 
23 Cancer Nursing 
24 Nursing (US) 
25 Oncology Nursing Forum 

f) Correlation of student use and tutor recommendation
The ranks of the 13 most used titles by students were 
tested for correlation with the ranks obtained from 
tutor recommendation. 


Table 9 — Correlation of student use and tutor
recommendation

Title    Rank by student use    Rank by tutors (overall)    Rank by tutors (in this set)

Nursing Times
Journal of Advanced Nursing
Nursing Standard
Professional Nurse
Nursing Research
Nursing (UK)
Cancer Nursing
Midwife Health Visitor...
Senior Nurse
American Journal Nursing
Health Visitor
Oncology Nursing Forum

From the ranks of student use and tutor
recommendation in Table 9, Spearman’s r_s = 0.7043;
at the p<0.05 level there is a significant correlation
between them.


Fig.2 — Scattergram of student use and tutor 
recommendation ranks 

Correlation of tutor recommendation and rank in
Citation Index 

The ranks of the most cited journals in the Citation
Index were tested for correlation with the ranks 
obtained by tutor recommendation. Medical journals 
such as Lancet and the New England Journal of 
Medicine which are not generally taken by nursing 
libraries were omitted. 


Table 10 — Correlation of tutor recommendation
and Citation Index

Title    Citation Index rank    Tutor recommendation rank    Tutor recommendation rank in this set

Nursing Research
Nursing Times
Journal Nursing Administration
American Journal Nursing
Heart & Lung
Oncology Nursing Forum
Nursing Outlook
Research in Nursing & Health
Journal of Nursing Education
BMJ
Journal Advanced Nursing
Nursing Management
Image
Advances in Nursing Science
Cancer Nursing
Seminars in Oncology Nursing
Western Journal Nursing Research
Nurse Educator
Nursing Administration Quarterly





From the ranks above, Spearman's r_s = 0.3631 and at
the p<0.05 level there is no significant correlation
between rank in the Citation Index and rank by tutor
recommendation.


Fig.3 — Scattergram of Citation Index and tutor
recommendation ranks


Discussion

a) Citation in Citation Index 

Only 5 of the most cited journals in the Citation Index
are British and Nursing Mirror ceased publication in
1985. From the British point of view some titles that
rank highly, e.g. Image and the Journal of Professional
Nursing, are not widely held in the UK. This list as it
stands would not be of use for the purpose of title
selection for a British nursing library. Popular British
titles such as Professional Nurse, ranked 45th,
Nursing (UK) - now the British Journal of Nursing -
ranked 69th, and Senior Nurse at rank 55 owe their low
rankings to under-citation in the American literature.
The American secondary journals Nursing at rank 26
and RN at rank 33 were higher placed than their
British counterparts. However, the British publication
Nursing Times came second overall. The Journal of
Advanced Nursing, which is the leading British primary
journal of nursing, lay in only 12th position, reflecting
the American bias of the Citation Index.

Some of the highly ranked titles such as Heart and 
Lung and Seminars in Oncology Nursing are specialized 
and are more suitable for advanced students and this 
reflects their citation in the research literature. This 
illustrates the under-representation of review journals 
which are of great interest to students, but are little cited, 
and the high level of citation of primary publications 
which are beyond the level required of basic students. 


b) Citation in the branches 

The top 30 most cited journals in the Journal of Advanced
Nursing are primary with the exception of Nursing Times,
Nursing Mirror, and Senior Nurse, which have combined
primary and secondary roles, and the Nursing Clinics of
North America, which has long and learned reviews.

The citations in Paediatric Nursing contain the 
rival paediatric nursing journal titles but MCN in 12th 
position and Journal of Paediatric Nursing in 14th 
position occupied lower ranks than had been anticipated.

The citations in the British Journal of Mental 
Subnormality under-represent its rivals such as Mental
Handicap Research. 

The citations in the Journal of Psychosocial 
Nursing were almost entirely American and to medical 
and psychiatric publications which would probably 
not be taken by a British nursing library. It is indicative 
of the sad state of British mental health nursing that 
there is no suitable publication for consideration. The 
Community Psychiatric Nursing Journal is the only 
British title in this branch but is specialized and has a 
newsletter function. The only other psychiatric
publication now being published is Archives of
Psychiatric Nursing which ranked only 8th, but
probably owes its low rank to having only commenced
publication in 1987.


c) Student use

In examining lists of citations at the ends of student
submissions, it became clear that not all were
genuine, for amongst the lowest ranked titles were
some which were not available to the students even at
the RCN or from the British Library, and these obscure
titles were ignored.

The rank obtained from photocopier use at least 
reflects which publications were of sufficient 
importance for students to pay for copying although it 
does not indicate the use to which the copies were put. 

As anticipated, some of the review journals were
ranked highly by students, especially Nursing
Standard, Nursing (UK) (now the British Journal of
Nursing) and Midwife, Health Visitor and Community
Nurse, since renamed Professional Care of Mother
and Child. These were all ranked low in the
Citation Index.

Overall 2/3 of the total number of references in the
citation count were to the top 10 titles and 90% of the
references were to the top 30 titles. This does not
necessarily represent what titles students ought to
have used.
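
The concentration figures quoted above (the share of references going to the top 10 and top 30 titles) can be obtained from a simple tally. The following is a minimal sketch only: the function name and the sample counts are hypothetical and are not taken from the study.

```python
def top_n_share(reference_counts, n):
    """Return the fraction of all references that go to the n most cited titles.

    reference_counts: one count of references per journal title (any order).
    """
    ordered = sorted(reference_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no references were counted")
    return sum(ordered[:n]) / total


# Hypothetical counts for four titles, not data from the study.
sample_counts = [50, 30, 10, 10]
share_of_top_two = top_n_share(sample_counts, 2)
```

A share computed this way describes what students did use; as the text notes, it says nothing about what they ought to have used.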

The two methods of sampling were combined. Of 
the top 15 titles so listed, 9 are secondary (review) 
journals, 1 is both primary and secondary, 5 are 
primary. The two American secondary journals RN 
and Nursing (US) were ranked lower than had been 
expected. 

The significant difference between rank by student
use and rank in the Citation Index showed that
counting citations in the literature is not a suitable
method for ranking publications for acquisition.


d) Tutor recommendation 

The ranking of journals reflected the subjective views
of tutors. Where no opinion was expressed, it was
because some tutors felt that they did not know the
titles well enough to comment.

The reasons for regarding titles as being appropriate
varied. Conversation with some tutors indicated that
some titles were rated low because they are regarded
as not being of a high enough academic standard.

Tutors recommended non-nursing titles. The 
foundation course includes biological sciences, 
psychology and sociology. Titles in these subject 
areas are missing in the nursing literature and are not 
indexed so the need for serials in these areas has been 
underestimated in the present study. Examples of 
such publications are Biologist, New Statesman and 
Sociology of Health and Illness. It is clear that there is 
need for further research upon which non-nursing 
serials might be needed to support Project 2000. 
However, the student survey has shown little use of 
them although they were available. 

There was a significant correlation between tutor 
recommendation and student use i.e. students do use 
the journals they ought to use in general but there are 
some discrepancies e.g. Research in Nursing and 
Health was ranked 3rd by tutors but was not used by 
the students surveyed. 

The unsuitability of using the Citation Index
rank for journal selection was confirmed on a
subjective, but expert, basis by the tutors as there
was no significant correlation between the rank of
titles obtained in the tutor survey and the rank of
titles in the Citation Index.
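
The text does not say which statistic was used to test the correlation between rank lists. Spearman's rank correlation is one common choice for comparing two rankings of the same titles, and the sketch below assumes it, together with rank lists that contain no ties. The function name and the sample ranks are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two tie-free rankings of the same titles.

    ranks_a[i] and ranks_b[i] are the ranks given to title i by each method.
    Uses the shortcut formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    which is only valid when neither list contains tied ranks.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both lists must rank the same titles")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two titles are needed")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks for five titles, not data from the study.
tutor_ranks = [1, 2, 3, 4, 5]
student_ranks = [2, 1, 3, 5, 4]
rho = spearman_rho(tutor_ranks, student_ranks)
```

A value near 1 means the two methods order the titles in much the same way, and a value near 0 means the orderings are unrelated. Whether a given value counts as significant depends on the number of titles and would need a significance table or test, which is not sketched here.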


Conclusion 

In order to produce an overall rank of journal titles
taking into account the structure of the literature, use
by students themselves and recommendation to reflect
what ought to be used, the ranks produced by the four
methods described in this study were combined. A
model is presented below.


Fig. 4. Model for overall ranking of journals

[Figure not reproduced. Legible labels: Citation in Citation Index; Tutor recommendation.]

In order to produce an overall rank, the ranks obtained
from the four methods were converted into scores and
added together. In order to counterbalance the
American and research bias of the Citation Index
rank, a score equivalent to rank was given only down
to rank 20, so all lower ranked publications were
given a score of 20. This also counteracted the fact
that the Citation Index had produced the longest list
and reduced the figures to levels comparable with the
other methods.

For citation in the branches, a score equivalent to
the rank of each title in a journal representing each of
the branches was given. However, if a title was in two
or more rank lists then the score given was equivalent
to its highest place. For example BMJ was ranked 4 in
the Journal of Advanced Nursing but ranked 3 in
Paediatric Nursing so the score given was 3.

For the student rank, the score was an average of the
photocopy and bibliographic citation ranks. However,
owing to small numbers the lowest score given was 16.

The rank for tutor recommendation was used
down to rank 30. If a title had been written onto the
questionnaire then the score which would have been
30 was improved by a figure of 3 per recommendation.
For example, Paediatric Nursing was written onto 3
questionnaires by tutors so its score became
30 - (3 x 3) = 21.
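
The scoring rules described above can be sketched in code. This is an illustrative reading of the text, not the authors' own procedure: the function names are hypothetical, and treating a title that is absent from the Citation Index as scoring 20 is an assumption. Lower scores indicate more important titles.

```python
from statistics import mean

CITATION_INDEX_FLOOR = 20  # ranks below 20 all score 20
STUDENT_FLOOR = 16         # worst score given for student use
TUTOR_FLOOR = 30           # tutor ranks are used down to rank 30
WRITE_IN_BONUS = 3         # improvement per written-in recommendation


def citation_index_score(rank=None):
    """Score equals rank down to 20; lower ranked titles score 20.

    Assumption: a title absent from the Citation Index (rank=None) also scores 20.
    """
    if rank is None:
        return CITATION_INDEX_FLOOR
    return min(rank, CITATION_INDEX_FLOOR)


def branch_score(ranks):
    """Use the title's highest place (smallest rank) across the branch lists."""
    return min(ranks)


def student_score(photocopy_rank, bibliography_rank):
    """Average the photocopy and bibliography ranks, with 16 as the worst score."""
    return min(mean([photocopy_rank, bibliography_rank]), STUDENT_FLOOR)


def tutor_score(rank=None, write_ins=0):
    """Use the tutor rank down to 30; a written-in title starts at 30 and
    improves by 3 for each questionnaire on which it was written."""
    if rank is not None:
        return min(rank, TUTOR_FLOOR)
    return TUTOR_FLOOR - WRITE_IN_BONUS * write_ins


def overall_score(citation, branches, student, tutor):
    """Add the four scores; titles are then ordered by ascending total."""
    return citation + branches + student + tutor


# The two worked examples from the text:
bmj_branch = branch_score([4, 3])          # 3
paediatric_tutor = tutor_score(write_ins=3)  # 21
```

Because every method is capped at a minimum score, titles that fall below the caps on all four methods receive identical totals and cannot be separated.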

The allocation of a minimum score does mean
that the lowest ranked titles cannot be put in rank order
but the objective of the present study was to consider
the most important titles.

The overall ranking of journals is shown below.


Table 11. Overall rank of nursing journals
Column a: rank from Citation Index
Column b: rank from journals representing branches
Column c: rank from student use
Column d: rank from tutor recommendation


[Most column values were lost in reproduction; titles appear in their original order with the figures that survive.]

Nursing Times
Nursing Research
Journal of Advanced Nursing
[illegible]
Research in Nursing & Health
Nursing (now British Journal of Nursing)
Professional Nurse
American Journal of Nursing
Nursing Standard
Journal of Psychosocial Nursing
Senior Nurse
Paediatric Nursing (UK) 20
Advances in Nursing Science 16
Nurse Education Today 20
Journal of Nursing Administration 3
British Journal of Mental Subnormality 20
Pediatric Nursing (US)
Archives of Psychiatric Nursing
Journal of Nursing Education
Mental Handicap
Nursing (US)
Health Visitor
Heart & Lung 6
International Journal of Nursing Studies 20
Midwife, Health Visitor... (now Professional Care of Mother & Child) 20
MCN 20
Oncology Nursing Forum
Cancer Nursing

The list was not extended further to include lower
ranked publications as they owe their places to a small
number of citations.

Taking the top 10 titles, it is interesting that the
highly placed Research in Nursing & Health and
Advances in Nursing Science were not used by students
although available in the colleges sampled. The
pre-eminence of Nursing Times and Journal of Advanced
Nursing from the British point of view was confirmed
by this rank table. The BMJ, although medical, ranked
highly. The American serial Nursing Research had a
high rank but some of its contributions are rather obscure
from the point of view of basic students. The effect of
having a floor score of 20 in the Citation Index rank was
that British publications with a very low score in the
index achieved a reasonable rank in the overall list.

It has to be emphasized that only nursing titles 
were considered and that there is need for study into 
the non-nursing serials needed. The time over which 
student use was measured could have been greater 
and ideally more bibliographies should have been 
examined but this proved to be a sensitive issue. It 
is hoped, however, that the present list is of interest.

When I began to consider this topic I was struck by the number of times I wrote the word ‘change’ in my
preliminary notes, and change is really the theme or leitmotiv which runs throughout this paper. I shall look first
at a number of issues which are changing the use of information and the nature of information work. Those
changes have implications for the future information workforce and for the education and training of that
workforce. Next, I shall briefly describe the way in which the main providers of information education and training
are themselves changing to meet the new demands. Specifically, I shall discuss developments within library and
information schools, the provision of continuing education against the ever present backcloth of reduced budgets
and inadequate funds, and finally I shall touch on the development of the new National Vocational Qualifications
in library and information work.


Any discussion of the library and information education 
and training scene in the United Kingdom is, 
necessarily, a discussion of change. 

The very nature of information work is being 
altered by changes which are taking place in the wider 
social and political environment: the increasing use 
of information technology in all sectors of business 
and society; the diffusion of information handling and 
retrieval skills across the whole spectrum of the 
workforce; and changes in the structure of 
organizations and in the way in which people work. 
These changes are all creating a situation in which the 
end-user is becoming more directly involved in using 
information. 

All of this has profound implications for the library
and information workforce and for the education and
training of that workforce.

Take technology as an example. Developments in 
telecommunications and networking are changing 
access to and distribution of information, and creating 
new, more flexible work structures. 

New ways of storing and disseminating information
are creating a demand for real time information. IT
has changed the way ordinary, non-specialist end
users approach information. Expectations rise ever
higher and information workers now have to respond to
a quantitatively and qualitatively different level of demand.

Meanwhile, the increasing sophistication of
systems such as intelligent front-ends for online
databases, expert systems and executive information
systems reduces the need for an information
intermediary to interface between the system and the
end-user.

This trend has been reflected in the type of people 
who come on Aslib training courses. We train more 
and more people from outside the traditional library 
sector in what were once the mystic skills of 
information science. 


And this impact of technology on the work of 
traditional information professionals is unlikely to 
diminish. After all, a generation of children are about 
to leave school and enter the workforce for whom 
information skills have been part of the core curriculum 
— alongside numeracy and literacy skills. These 
information and computer literate workers of the 
future will make radically different use of the corporate 
information resource. 

Information skills and expertise are already widely
spread throughout the workforce. Indeed, some much
quoted commentators have defined 50% of the workforce
in post-industrial economies as information workers.
This, combined with the increasing user friendliness
of electronic information, will inevitably reduce the
need for the traditional information intermediary.

Another fundamental change, and one which is 
inextricably linked with technological developments 
and the diffusion of information skills, is the trend 
towards decentralization of the information power 
base. IT enables information to move rapidly up and 
down, and in and out, of the organization. This 
facilitates the development of flatter, less hierarchical 
organizational structures and of more flexible working 
arrangements. In the commercial sector this 
development has been driven by business needs. 
Competition and the need for quality control require 
that management information be made available 
quickly, and kept near to where the everyday business 
decisions are being made. 

In the UK public sector the move towards 
decentralization has been driven to some extent by a 
demand for democratization and quality control, but 
more generally, it is the unintended result of the 
political showdown between our Central Government 
and Local Government. 

Yet another major challenge for the information
profession may come from demographic changes in
the potential workforce. Significant demographic
changes are already having an impact across Europe. 
By 1995 there will be 30% fewer school leavers in the 
UK. The UK Government intends that, by the year 
2000, the number of school leavers who go into 
higher education will be increased from the present 
figure of about 15% to 20%. But, even with this policy, 
the demographic dip will have an effect on professions 
such as librarianship whose members require high- 
added value at the early stages of their careers. 

The problem will be exacerbated for the library 
and information sector because it will have to compete 
for the best candidates against other professions, such 
as law or finance, where the rewards are very much 
greater in terms of both salaries and status. 

The poor image of the library and information 
worker, which has been an issue for the past twenty 
years, will become even more significant as 
recruitment difficulties increase. 

The present high level of unemployment in the 
UK goes only a certain way to alleviate the potential 
recruitment problems. 

Aslib Professional Recruitment Ltd, Aslib's 
specialist recruitment agency, presently has the highest 
number of job seekers on its books since the early 
1980s. Yet already many of those candidates do not 
have the skills which employers are looking for. This 
mismatch between the available workforce and 
employers' requirements can only be addressed in the 
short term by training and continuing education. 

I have discussed just some of the issues which are 
bringing about changes in the field of library and 
information work and creating an increasing mismatch 
between the skills of the available workforce and the 
requirements of employers. 

It is the job of the providers of library and information
education and training to fill this skills gap. But, in order
to do that successfully, they too have had to change.

The main providers of information education and 
training in the UK are the 17 schools of library or 
information studies, professional associations and 
commercial companies. 

Essentially, training takes place at three levels:
initial vocational education, continuing education and
on-the-job training which is carried out in the
workplace to teach the skills and tasks related to a
particular job in a particular context.

Initial vocational education for professional 
information staff will, typically, be a vocational degree 
either at graduate level or a post-graduate qualification 
supported by a first degree in another discipline. The 
vocational qualification will usually have been gained 
at a school of information or library studies. 

These professionals work with support staff, who
may or may not be graduates in other disciplines,
and who may have vocational qualifications at
para-professional level.

This initial education is, in theory at least, 
supplemented by continuing education and training 
to keep up to date throughout the career.

Continuing education and training activities
include conferences, seminars and professional 
meetings and discussions as well as more formal 
training activities such as short courses, workshops, 
lectures and in-service training. 

The major providers of this training activity are 
the various professional bodies, such as Aslib, which 
runs the largest programme of short courses on 
information management in the UK, but a very 
significant proportion of training is supplied by 
commercial companies from inside and outside the 
information industry, who are responding directly to a
perceived market demand. 

The higher education sector, in line with the UK 
Government's more ‘entrepreneurial culture’, is being 
required to generate income and so many institutions 
are introducing short courses or other training events. 
Colleges and universities have a very important 
contribution to make by pioneering developments 
such as distance and open learning using the experience 
and facilities available within their institutions. 

For the UK library schools meeting the challenge 
of change has not been easy. Changing any 
institution is a slow process. However, the library 
schools are now very different from the schools of 
even a decade ago. 

Many of them have been repositioned within their 
institutions and most of their courses are now taught 
within faculties of either computing, business and 
finance or communication. Library and information 
departments now include an increasing number of 
staff whose academic and professional roots are not in 
library and information science. 

Curriculums have had to be re-examined to take
account of the professional, technological, market
and, indeed, ideological changes which have been
such a feature of the last decade. Courses based on
traditional librarianship, which combined document
management skills with the public service ethic, have
given way to courses on information management.
These courses, in Tom Wilson's words, focus on:

‘the effective management of the internal 
and external information resources of 
an organization through the proper 
application of information technology.’ 

Information here is defined as: ‘an important
economic resource and IT is the tool for its effective
management.’

But such changes, which go to the heart of the 
curriculum, have not been effected without 
controversy. Among academics there is not universal 
agreement as to the validity of courses which, as 
Muddiman of Leeds Polytechnic puts it in an article in 
Personnel Training & Education (7(3) 1990): 

‘cluster around a notion of information
that heavily relates to organizational
efficiency and commercial competitive
advantage.’

At undergraduate level particularly, Muddiman
questions whether courses based on labour market
criteria, designed for employment niches which
in any case may well come later rather than earlier in
the professional career, and tied to a
particular ideology are the best way to facilitate broad
educational objectives.

And as new elements are introduced into the 
library schools so other subjects which have 
traditionally been taught have to give way. As the 
word library disappears from many, though not all, of 
the titles of academic departments so many of the 
librarianship specialisms are disappearing from the 
curriculum — specialisms such as music librarianship. 
The library schools attempt to address this problem 
by offering modular courses, but there is no doubt that 
it is increasingly difficult for the academic sector to 
provide for minority professional interests in a 
systematic way. 

And it is not just the traditional specialisms which 
are missing from the information curriculum. There is 
widespread agreement about the need to increase the 
level of management education for information 
professionals. 

Sylvia Webb’s 1991 research into continuing 
professional development for information staff (British 
Library R & D Report 6039) led her to suggest that if 
library and information schools intend to cater for the 
markets of the future they will need to emphasize the 
importance of management and decision making skills.

Similarly, the UK Library Association’s Education 
Committee (in its response to the British Library 
Information UK 2000 Report), has called, inter alia, 
for training in: 

‘high level and ongoing management
skills including political, financial [&]
personnel management’

and has acknowledged the need for: ‘more ambitious/
management-orientated recruits with transferable skills.’

In an exchange of letters published in the 
professional press in August 1992, several library 
schools responded to criticism of their narrowing base 
of specialisms by explaining that it is the responsibility 
of employers to provide training in such areas. There 
can be no doubt that training for modern information 
work is an on-going process. It is no longer possible, 
if it ever was, for initial education to teach all the 
skills, knowledge and attitudes which will be required 
throughout a successful career. As the role of the new 
information manager becomes broader the need for 
continuing education becomes greater than ever. The 
distinction between initial education and continuing 
education is becoming blurred and training is becoming 
even more essential. 

Not unnaturally, then, alongside the changes which 
we have seen taking place in the library schools, some 
of which I have briefly described, the provision of 
continuing education and training has also evolved in 
response to changes in the information marketplace. 

Perhaps the most interesting and heartening change 
has been the remarkable growth in the number of 
training events available to information workers.
Margaret Slater, in an article in the Journal of
Information Science 17 (1991) shows that while the 
provision of short courses and seminars remained 
fairly static between 1971 and 1985, between 1985 
and 1989 the number of events rose to nearly ten 
times the number available in 1971. 

This increase in provision must indicate an increase 
in the level of training activity being undertaken by 
information workers. 

Certainly, the number of people attending Aslib 
training courses has risen steadily between 1984 
and 1991. 

Research by Lewins et al. during 1988, reported in
Personnel Education & Training 8(2) 1991, established
that 86% of respondents to a questionnaire had attended
at least one training event during the previous year.

Notwithstanding this growth, however, the 
continuing education and training scene in the UK 
remains unsatisfactory because insufficient funds are 
made available for training. In spite of repeated 
Government statements on the importance of training,
UK industry has no tradition of investing in its 
workforce or in the national skills base. 

Across the board managers complain that their 
training budgets are too small. Slater, in her Research
into Investment in Training, quotes a manager from a 
public library service, who said: 

‘Training is an investment in the most
expensive resource that the library has!
Computers, for instance, we spend 10%
of the cost of computer systems on their
annual maintenance. But we do not spend
10% of the £3 million staff bill on staff
maintenance in the form of training...’

According to Slater, total training budgets can 
range from as little as £24 per person per year in a 
public library system to £2,000 a head cited in one 
commercial organization. 

Typically where training budgets exist these are 
likely to be about £250 to £300 per person per year. 

Lewins' article quotes 50% of respondents as
stating that their skills had not been updated by 
training courses which they had attended — the training 
had failed them. In the very next paragraph, however, 
it states that nearly half of these training courses were 
free of charge and over a quarter cost less than £40. One 
has to ask, what did these trainees expect for that price? 

Top quality training costs money to develop and 
provide and genuine quality control will only be 
achieved if there is an implicit contractual relationship 
between the trainer and the trainee — and if the trainee 
can go elsewhere if need be. The fact that so many 
people in the UK information community still 
complain that they have no funds for training reflects 
the low priority that training still has among employers. 

Not all managers of library and information 
services have direct control of the budget for training 
their staff. 

In many organizations training budgets are held 
centrally by a personnel or training department, and
there are still many organizations with no formally
allocated funds for training at all. In this latter case 
money for training, if it is made available at all, is 
simply transferred from other budget heads according 
to the individual manager’s definition of immediate 
priorities. 

Interestingly, Slater’s research suggests that 
managers of library and information services could 
well be better off without a formal departmental 
training budget. As Slater puts it: 

‘Managers who did not have a separate 
training budget, who just drew on the 
demand principle from a central training
budget, seemed to fare better. 
Reasonable requests were accorded to: 
few problems were experienced.’ 

Certainly, the decentralization of budgets has been 
perceived as a way of cutting overall corporate 
expenditure, particularly in the public sector. 

Even when it is possible to identify a discrete
training budget for use by the library or information 
staff there may be a degree of ambiguity as to exactly 
what this budget must cover. Some budgets may not 
include attendance at conferences — which are not 
always defined as training. This may make conference
attendance easier in some organizations, but for others 
paying conference fees may be entirely out of the 
question — no matter how practical the conference. 

But, training budgets are not the only problem. 
Even within organizations which do have adequate 
training budgets information staff still have problems 
getting time off for training. 

Indeed, it is a common remark among UK library 
managers that their main problem in training their 
staff is not the direct cost but the staff time required. 
Days out on training courses or regular day-release 
arrangements can impose a severe strain on library 
and information units which are likely to already be 
running at minimum staffing levels. 

This problem will be particularly acute for one- 
person-librarians — professional library or information 
staff who run their service alone or with only clerical 
or secretarial back-up. By definition they have no-one 
to cover for them while they undertake training. 

And during an economic recession like that which 
we are presently experiencing in the UK investment 
in training is reduced even further. In an economic 
culture which sees training as a low priority at the best 
of times this is one of the first areas to be cut when 
times get hard. Figures published by the Department 
of Employment in the Employment Gazette, August 
1992 show a 1% drop in the number of people receiving
training as early as spring of 1991. The likelihood is 
that this will have decreased further and faster since 
then because the recession has continued longer and 
been more severe than many analysts and employers 
expected. 

The effects of the recession, coupled with 
government policies for ‘reform’ of the public sector 
during the 1980s, have put severe pressure on the
public sector where much of the library and
information activity is concentrated. Massive 
programmes of change in areas such as the National 
Health Service have created skills shortages and a 
need for more, not less, training. 

In this difficult climate, with resources so stretched,
it is in a sense reassuring to see that a fair amount of 
continuing education and training is still undertaken 
and much interesting work is being done on the 
training front — both in terms of the variety of courses 
being offered and in the use of new modes of offering 
training — educational technology, distance and open 
learning and better access for groups such as women 
returners. 

Undoubtedly, the most significant and far reaching 
change in UK information training will spring from 
work currently being undertaken to develop the new 
National Vocational Qualifications, or NVQs as they
are known. 

In the early 1980s the UK Government recognized
that our poor skills levels, across the workforce, have
a negative impact on competitiveness. It was therefore
decided to completely overhaul the provision of 
vocational training. A new scheme of National 
Vocational Qualifications, designed to provide 
comparable, transferable vocational qualifications for 
virtually everyone in the workforce throughout the 
country is being developed. 

Work has been underway for two years now to 
develop National Vocational Qualifications in library 
and information work. NVQs will be radically different 
from existing qualifications in a number of ways. 
Firstly, access to the training is completely open and 
the process of training and assessment is separated. 
Secondly, the qualifications will be based on the 
functions which are carried out in the workplace and 
standards are defined by practitioners rather than 
trainers or educators. This is unlike existing training 
which is largely defined by the course syllabus. 

The standards will enshrine 'good and best 
practice', rather than actual practice as with existing 
internal training. And so, in theory at least, standards 
will be raised overall. 

Over the next three years NVQs will be created for 
all levels of work from clerical to professional and 
managerial. 

Not surprisingly, the introduction of a whole new 
qualification structure is proving most controversial 
at professional level, and it has to be acknowledged 
that a number of very important issues have yet to be 
resolved. 

Library and information employees, educators 
and trainees will have to consider, for example, which 
existing vocational qualifications are to be included
in the new structure, and what of the rest? 

At present library and information work is a 
graduate entry profession. But this does not fit well 
with the National Vocational Qualification principle 
of open access. Will the open access policy have a 
damaging effect on the status of the profession?

Then there are issues which arise from the particular
nature of information work: issues of process versus
product. If the emphasis of NVQs is on products and
outcomes (what you do), how do you measure how
well you do it (the performance)?

And finally there remains the very large question 
of how one is to build the body of knowledge, which 
is the foundation of the profession and of the 
professional education, into qualifications which are 
essentially competence rather than knowledge based. 

Notwithstanding these outstanding questions which 
have yet to be resolved there is a marked degree of 
goodwill towards the NVQs among information 
professionals and a determination to make them work. 

There is no doubt that the information community 
is well aware that its future success depends, perhaps 
more than any other single factor, on how well its 
education and training programme can create a 
workforce which can meet the challenges and 
opportunities inherent in working with information. 

Library and information workers can no longer 
afford to limit themselves to their traditional, reactive, 
service role. Much of the repetitive work involved in 
traditional librarianship is now done by computers. 
Library automation software can be used to catalogue 
stock and to manage acquisitions, loans and journal 
circulation. OPACs now allow users to browse through 
the catalogue and find their own information. 

Information professionals are having to abandon 
their old role as passive gatekeepers and adopt a new 
proactive role as facilitators — designing, organizing 
and managing the systems which enable users to 
access information directly. 

The best of the new information professionals 
understand the strategic role of information within the 
organization; they are redefining their own function 
and are not afraid to create value added end products 
which directly address their clients’ needs. 

Ultimately the profession needs, and will continue 
to need, a workforce with the necessary skills base 
and, as important, with the right personality and 
attitudes — people who are flexible, entrepreneurial 
and receptive to change. 

The future success of the library and information 
profession will depend on a recognition of the 
implications of the dynamics of the new information 
environment on the workforce who serve the needs of 
those who use information. 

This in turn will depend on the extent to which 
library and information education and training can 
respond adequately and swiftly to the changing 
environment. 


Introduction 

I would like to discuss the implications of recent NHS 
reforms for the provision and use of information in 
healthcare—particularly in the National Health Service. 

In July this year the British Library R&D 
Department organized a seminar entitled Health Care 
Information in the UK and invited a group of 
information professionals from across the healthcare 
spectrum, as well as senior managers from the NHS. 
The seminar provided an opportunity for NHS staff to 
give their views on healthcare information provision 
and use in the context of information and library 
services provided by the British Library. 

The title of my paper today is inspired by a paper 
given to the British Library R&D Department's seminar by Dr 
Christopher Bentley, Director of Public Health and 
Strategy, Worthing District Health Authority, entitled 
Health of the Nation in which he identified a 
fundamental change in the way in which healthcare is 
provided, and the implications for information 
provision. 

...of all the changes that have occurred within 
the National Health Service over the last 10 
years, the most fundamental... occurred during 
the recent NHS reforms. Health authorities 
previously had responsibility for the 
healthcare needs of a defined population. 
This has always been the thrust of the public 
healthcare approach, but not necessarily that 
of planners of service. On the whole many 
services have developed incrementally and 
have been shaped perhaps more by the interests 
of the medical staff, however well meaning, 
than by prioritising the unmet needs of the 
population at large. 
Dr Christopher Bentley (1992) 


The fundamental change referred to by Bentley is 
a switch from a service-led to a (more appropriate) 
needs-led approach to service planning 
and provision of healthcare. The implications for 
information provision and use are enormous. 


Reluctant information users 

With the introduction of general management in the 
early 1980s (I will return to this in a moment) an 
information revolution began in the NHS. The seeds 
were sown for a different way of managing the NHS 
— put rather crudely, there was a movement away 
from 'seat of the pants management' towards 
more systematic management based upon data 
and information. 


In the 1980s most NHS managers did not want 
or welcome information, and the accompanying 
IT revolution. Indeed, many were actively resistant 
to the introduction of a new generation of 
information systems. 

On the other hand, medical scientists, physicians 
(particularly those involved in research, and 
monitoring of activities), postgraduate medical 
researchers, and many other relatively isolated groups 
within healthcare have always wanted to work in an 
information-rich environment, although they have been 
in the minority. The desire to work in such an environment 
has been with us for a long time. 

I look forward to such an organization of the 
literary records of medicine that a puzzled 
worker in any part of the civilized world shall 
in an hour be able to gain a knowledge 
pertaining to a subject of the experience of 
every other man in the world. 
Dr George M. Gould (1898) 


Carmel (1992) points out, in the context of what 
he calls the ‘Gould Standard’, that the mission of 
information and library services has not changed: the 
objective remains to ensure that everyone engaged in 
healthcare has access to the knowledge they require to 
carry out their work effectively. 

Librarians of course have for a long time provided 
library and information services pertaining to medical 
science and related fields. But during the last 10 years 
another totally separate information activity has been 
taking place: that is, information for management. 
Until recently most NHS librarians have had relatively 
little input into this new activity. I will return to some 
of the problems confronting the NHS library service 
later, but first let us look in a little more detail at the 
information needed for management. In so doing I 
hope to show how it is that a change of attitude has 
come about on the part of most managers in the NHS 
during the last few years. 

For most of the 1980s management made 
little use of management information systems 
that were gradually being put in place. In the 
1990s management has a much greater appetite 
for information, even though it is restricted to 
relatively few types of information. I hope to 
show how this change of attitude has come about. 
The technological revolution in IT of course is 
relevant, but really it is an attitudinal change 
brought about by various pieces of legislation 
that has had the major impact upon managers, 
and their new-found appetite for information. 


Definitions 

A few words of explanation about the usage of ‘data’ 
and ‘information’ in the NHS are in order. The term 
information can be used in its widest sense: that is, to 
include all types of numerical and textual data. Textual 
information comprises all forms presented as text 
(including books, journals, reports, circulars, letters, 
etc). Numerical information comprises all forms 
presented as numeric data (including minimum 
datasets, population data, statistics, surveys, 
epidemiological reports, etc). 

Looked at in a crude way, we can say that the 
usage of the term ‘textual data’ in the NHS now refers 
to what we would call documented information — or 
perhaps documented knowledge; whereas numerical 
information refers (for the most part) to both patient- 
generated data and other internally generated data of 
interest to management. 


NHS Reforms 

Underlying the revolution in information provision 
and use is a substantial programme of legislation, 
official reports, consultative papers and 
recommendations. Since 1980 there have been nearly 
19,000 official publications concerned with health. A 
handful of these have had a profound effect upon 
information provision and use in the NHS. 

The 1979 Royal Commission on the NHS made 
many recommendations: we can highlight a need for 
medical audit, a review of nurse education, the 
improvement of hospital administration, a greater 
contribution by medical staff to the management of 
hospitals, and the simplification of the 
structure below regional level. 

The 1983 NHS Management Enquiry (led by Sir 
Roy Griffiths) was responsible for bringing in general 
management, with attention to the levels of service, 
quality of product, meeting budgets, cost improvement, 
productivity, motivating and rewarding staff, research 
and development, and many other matters. 

Following the Royal Commission, and at the same 
time as the NHS Management Enquiry was being 
conducted, another major body concerned with 
information was established in 1980 — the joint NHS/ 
DHSS Steering Group on Health Service Information 
(more commonly known as the Körner Committee). A 
series of reports resulted between 1980 and 1984, 
dealing with aspects of activity and workload measures, 
manpower, and financial information. The Körner 
Group emphasized the need to improve management 
information systems, particularly by simplifying and 
standardizing minimum data requirements. 

In retrospect, it can be seen that one of the 
weaknesses of the Körner recommendations was the 
emphasis upon the provision of information rather 
than upon the use of information. It was not for many 
years, and after many other reports and legislation, 
that managers were enabled to make use of the 
information which gradually became available 
following the Körner reports. In fact there was quite a 
delay in implementing Körner because it was not 
mandatory upon health authorities to collect Körner 
data (in the first round) until April 1987, and to complete 
the second phase of Körner implementation by April 
1988. But even as late as 1990, and in the face of 
Körner data available to all managers in the health 
service, relatively few were making any productive or 
constructive use of the data. Many people in the NHS, 
ranging from staff involved with data capture through 
to senior management, regarded it as more of a nuisance 
and an obligation than of help and value to 
their own particular work. 

In 1986 the United Kingdom Central Council for 
Nursing, Midwifery and Health Visiting (UKCC) 
published Project 2000: a new preparation for 
practice. This resulted in a major reform of nurse 
education, with a significant reduction in time spent 
on ward duty and a shift towards classroom based 
education and an emphasis upon self-directed research, 
aimed at producing nurses as health practitioners able 
to make decisions ‘based on strong theoretical 
knowledge (and one might also add data and 
information) combined with practical skills’. 

The changes in management practice initiated by 
Griffiths were further developed and expanded in 
Working for patients published in 1989. Many regard 
this as the most significant review of the NHS in its 40 
year history. Indeed, many people refer to this as the 
NHS review — although there have been others since 
which have also had an important bearing upon 
information provision and use. The implications of 
Working for patients for a new generation of 
information systems were indeed spelt out in Working 
Paper No. 11, which formed part of a series of working 
papers following the publication of Working for 
patients. Working Paper No. 11, A framework for 
information systems, set out the short-term goals for 
the NHS, detailing information systems that had to be 
in place by 1 April 1991. But it also included longer 
term aims, including the building of an infrastructure 
for information systems to facilitate developments 
such as integrated hospital support systems, resource 
management, and medical audit. 

However, it was some time before a critical mass 
of NHS staff realised the full implications for 
information systems of Working for patients. We can 
see now, in retrospect, that the key changes which 
were to be implemented by 1991 nearly all had 
important implications for information systems. In 
brief, the recommendations included: 

• A delegation of responsibility and power to 
  local level 
• The establishment of self-governing NHS 
  hospital trusts 
• An improvement in waiting times and quality 
  of service, enabling GPs to apply for and have 
  control over their own budgets 
• A reduction in size and a reorganization on 
  business lines of regional, district, and family 
  practitioner bodies 
• An extension of medical audit 
• Improved information for managers and 
  professional staff; this included the 
  development of the resource management 
  initiative to link up information on diagnosis 
  and cost of treatment, in order to provide a 
  complete picture of resources used in the 
  treatment of hospital patients 
• Improved information for patients, such as clear 
  leaflets explaining facilities and services 
  available, and information on aspects of 
  healthcare such as admittance to hospital. 

I leave until last mention of the most important 
change, which allowed all NHS hospitals to offer 
services to any health authority (and also to the 
private sector); this is the recommendation that has 
led to the so-called purchaser/provider split in the 
NHS, and the resulting need for health authorities to 
effect contracts between themselves. The information 
required to effect contracts, and subsequently to 
monitor and evaluate them, should not be underestimated. 

Other important documents issued in the 1980s 
include Primary Healthcare: an agenda for discussion, 
which led to the 1987 White Paper Promoting better 
health and was followed in 1988 by the Health and 
Medicines Act. These directives were intended to 
improve value for money, increase the choice of 
services available to patients, and to make the services 
more accountable to consumers. 

The 1988 report to the Secretary of State for 
Social Security Community care: agenda for action 
reviewed the ways public funds were used to support 
community care policies, and advised on options for 
action to improve the use of funds for more effective 
community care: three areas were highlighted — the 
importance of health promotion, local provision of 
general information for providers of community care, 
and an increase in choice and range of options available 
to patients. Further directives were given in the 1989 
White Paper Caring for people. 

The publication of Government and Department 
of Health White Papers and strategies continued into 
the 1990s. The 1991 discussion document Health of 
the Nation put forward themes which have implications 
for information provision. They are: 

• the identification of key areas for improvement 
• securing genuine improvements by target setting 
  and monitoring progress 
• the improvement of knowledge and 
  understanding in order to review and reappraise 
  priorities and targets 
• the development of a public health information 
  strategy, with an emphasis on monitoring, 
  effectiveness, and health outcomes. 

Finally, I will refer to, although leave the details 
for Bob Gann to discuss, the 1991 The Citizen's 
Charter which set out the mechanics for improving 
choice, quality, value and accountability. This was 
followed by the better known 1991 The Patient's 
Charter which spelt out new rights for patients, which 
include a limit to waiting times for treatment to a 
maximum of two years, a complaints procedure, and 
information for patients on local health services 
from health authorities, GPs, and the Community 
Health Council. 

The effect of the extensive and far-reaching NHS reforms (as 
detailed in these various publications) has been to 
stress the need for extensive, comprehensive, 
accurate, and up-to-date information (of all types) to 
support the work of NHS staff at all levels, and in 
addition to provide information for healthcare 
consumers. 


Types of Information Need 
There are four major categories of information need: 

1. Scientific, clinical, and health services 
   information 
2. Patient-generated clinical data 
3. Corporate activity management information 
4. Information for patients, carers, and the general 
   public. 

Examples of category 1 include scientific, clinical, 
and health services information in documented form 
— ie. journals, monographs, abstracting and indexing 
tools, government reports and directives, CD-ROMS, 
online information services, dial-up knowledge bases, 
computer-based knowledge systems, reference books 
and medical textbooks, and a whole array of NHS 
circulars, letters and documents, policy statements, 
statistical reports, theses and dissertations, government 
circulars, and so forth. 

Examples of category 2 include medical records 
which contain vital information on patients’ personal 
details such as age, sex, date of birth, marital status, 
GPs’ and consultants’ names, admissions to hospitals, 
wards, and details of discharge. This information 
forms the major part of the minimum data sets 
recommended by Körner. Further examples of 
category 2 include diagnoses, examinations, radiology 
and diagnostic imaging, preventive procedures, 
operative procedures, other therapeutic procedures, 
and drugs and appliances. 

Examples of category 3 refer mainly to internally 
generated throughput data (usually numerical) plus 
some externally produced statistics. Examples 
include statistical information on service utilization 
and costs, waiting lists, number of beds and 
occupancy rates, theatre schedules, staffing and 
salaries information, demographic statistical 
information, reports on survey data produced by 
health authorities and others, epidemiological 
reports, and much more. 

Examples of category 4 include information on 
waiting lists, how to complain, leaflets and booklets 
on specified conditions, information about self-help 
groups and voluntary organizations, information on 
how to maintain and improve health, information for 
ethnic minorities, and targeted information for other 
groups such as young people and children, people 
with learning/reading disabilities, and so forth. 


The users 
There are four main user groups. 

Providers are those who provide healthcare 
services: that is, doctors, nurses, therapists, 
radiographers, psychologists, dieticians, pharmacists, 
dentists, medical scientists, GPs, health visitors, and 
other community health staff. Providers have an 
enduring need for information, help and advice on 
continuing education and training. It is vital they keep 
up to date with new trends, developments, and 
innovations in their specific areas, in order to undertake 
informed and effective research and teaching, and 
make appropriate clinical decisions. In addition, 
increasing emphasis is placed upon medical audit, 
particularly in the context of improvements in the 
quality of healthcare. This necessitates continuously 
updated textual information on recent developments 
and innovations in healthcare delivery as well as 
patient-generated data and the hospital activity 
throughput data referred to earlier. 

Purchasers represent a relatively small group of 
NHS staff, albeit a powerful group, since it is 
responsible for buying healthcare services. Purchasers 
include health authorities, family health service 
authorities (FHSAs) and GP fundholders (GPFHs), as 
well as insurance companies. Purchasers are charged 
with purchasing healthcare on behalf of their resident 
populations (be it a district health authority or those 
constituting the list of GPs). Purchasers have to make 
far-reaching decisions on healthcare policy, priorities, 
and provider services, as well as staffing and financial 
management decisions. These decisions have to be 
based upon an in-depth knowledge of the health needs 
assessment of local resident populations, the quality 
of healthcare and outcomes assessment, public opinion, 
and government policy. 

The 1991 Department of Health publication The 
Health of the Nation required health authorities, and 
in particular, the so-called purchasing authorities to: 

• assess the state of health of the people they 
  serve, and what needs to be done to improve it 
• set priorities for improvements 
• purchase effective services to meet these needs 
• stimulate an informed discussion of action needed 
  at local level to address wider health issues 
• work in cooperation with each other and other 
  agencies in taking effective action on health threats 
• assess the effects of policies and programmes 
  in terms of health improvements. 

Learners, educators and researchers form the third 
group of users, where information needs are becoming 
more complex and more wide-ranging. Undergraduate 
medical education has always taken place in an 
information-rich environment, and increasingly the 
same is so for nurse education, medical records 
personnel, and many other groups, including managers. 


The 1991 Research and Development Strategy for 
the NHS is concerned with ensuring that research and 
development becomes an integral part of healthcare. 
How are the results of existing research to be better 
disseminated to practitioners? How is the NHS to 
decide upon priorities for research in the absence of 
comprehensive databases of existing research and 
indicators of its efficacy in practice? These and many 
related questions are only now being addressed 
and will place further demands upon information 
and library services both within and, increasingly, 
outside the NHS. 

Consumers of health information, the fourth 
category of users, come last in this list, but theirs is perhaps 
the fastest growing area of healthcare information 
provision. The increasing demand for healthcare 
information is not restricted to patients, but extends to 
all members of the public and covers all types of 
healthcare. This includes not only those being treated by 
doctors, but also people receiving therapy or 
community care of all kinds, as well as their carers 
and those making a decision as to the options, risks, and 
benefits of treatment, better diets and lifestyle, joining 
self-help groups, complaints procedures, or claiming 
welfare rights and benefits. Consumer expectations 
generally are now much higher than a decade ago and 
the type, level and depth of information required are 
increasing rapidly. 


Delivery of healthcare information in the NHS 
Healthcare information in the NHS is provided by a 
wide range of libraries, information departments in 
districts, units, and regions, and more recently by the 
newly emerging purchasing intelligence units. Other 
sources of information include drug information 
services, local authority and government departments, 
commercial and industrial information services, and 
university, college and public libraries. 

Jane Holdsworth in her report The provision of 
healthcare information in the UK: summary report 
provided a comprehensive view of the structure of 
library-based information supply in the NHS. 
Traditionally, libraries were seen as serving only 
clinical staff but as a result of recent NHS reforms 
libraries are now providing a wide range of services to 
a much greater cross section of the NHS, including 
large numbers of paramedic staff (from dentists to 
occupational therapists), nurses, health promotion 
staff, estates and support staff, managers and patients. 

NHS libraries have a long tradition and a reputation 
for the provision of scientific and medical information 
to support nurses, clinicians, students and academic 
staff. Now libraries are meeting further challenges 
with a need to provide different types of information 
to a widening group of users. Particularly significant 
at the present time is the need to extend the proactive 
and evaluative skills that are required to provide 
information support to managers. 

During the early 1980s completely new and 
independent information departments were set up in 
District Health Authorities. This was in response to 
the increasing use of information technology, the 
implementation of the Körner recommendations, and 
the implementation of the recommendations of the 
joint NHS/DHSS groups on performance indicators. 
Districts reviewed their corporate throughput activity 
data and it became obvious there was a need for new 
technology to handle and disseminate the vast amounts 
of corporate activity data, but the methods necessary 
were not clear. Following Körner, many senior 
managers realized that a longer term strategic approach 
to information (that is, mainly numerical data) was 
necessary in order to ensure that the required data was 
produced and defined in a standardized form. The 
difficulties in cutting across traditional organizational 
boundaries and linking data produced by one 
department with that of another have remained a 
problem. Fully integrated information systems have 
still to be achieved. 

It is usually assumed that the Körner reports 
concentrated entirely upon activity data relevant to 
the management of the health service. But it is 
worth recalling that one of the Körner documents, 
Converting data into information (which incidentally 
included proposals on the establishment of district 
information services), proposed that an ideal 
information service would: 

• be a repository for statistical data collected 
  locally within and outside DHAs, as well as 
  data produced regionally and nationally 
• collect internally and externally produced 
  documentation on policy, planning and research 
• be able to collect data on an ad hoc basis 
• be able to analyse statistical data and present 
  relevant information drawn from different 
  sources to meet the requirements of 
  management. 

By 1985 the majority of Districts in England had 
appointed or planned to appoint an Information Officer. 

Since 1985 District (or District/purchasing 
authority) information departments have grown 
greatly. Very few librarians have 
been employed in these information departments, 
although many of the skills of information scientists 
and librarians are relevant to the work of health 
authority information departments. The work of 
information departments has been guided by the 
1986 National strategic framework for information 
management in the hospital and community health 
services issued by the Information Management 
Group of the NHS Management Board, which also 
produced a Framework for information systems 
(Working Paper No. 11 of the Working for patients 
White Paper), and supported by a whole variety of 
prototyping projects for information systems 
developments such as DISS (District Information 
Support Systems), HISS (Hospital Information 
Support Systems), the Open Systems Interconnection 
(OSI) project, and more recently DISP (Developing 
Information Systems for Purchasers). The NHS 
Management Executive Information Management 
Group will shortly issue an updated IM&T 
(Information Management and Technology) Strategy, 
which will replace the 1986 strategy. 

One of the reasons for the neglect of libraries, 
librarians, and information scientists by this part of 
the NHS information revolution is 
the emphasis upon numerical as opposed to textual 
data in the White Papers and NHS and Department of 
Health information plans and strategies. However, as 
health authorities have been charged with assessing 
the healthcare needs of their resident populations and 
as the supporting infrastructure provided by the Department of 
Health has developed (which includes DISP, and the 
Purchasing Intelligence Movement), the value of some 
of the more traditional and established skills of 
librarians has become apparent. 

Purchasing intelligence is the phrase used to 
describe the information needed by purchasing 
authorities (ie. DHAs, FHSAs, and GPFHs) in support 
of their new contracting functions. The information, 
or ‘intelligence’, covers all types of information, 
both textual and numerical. The NHSME DHA project 
team document Purchasing Intelligence emphasizes 
the need to integrate all types of information including 
routine, ad hoc, local, and national textual and 
numerical information, which will be needed to support 
healthcare needs assessment, service evaluation and 
monitoring, as well as contracting and management 
procedures. The document goes on to suggest that 
four aspects of the intelligence function must be 
addressed: 

• skills: it is necessary to coordinate the 
  traditional information skills of the librarian 
  with statistical information skills 
• alliances: links with local organizations to 
  maximize local resources 
• rules: to determine the purpose and limitations 
  of the information needed 
• facilities and technology: to make use of 
  library technology in accessing, assembling, 
  analysing, and interpreting information. 

Libraries and information departments have 
traditionally developed completely separately in the 
NHS, and generally do not cooperate or integrate 
their services. This has inevitably led to a duplication 
of both services and resources. The 
professional information expertise already available 
in Districts and Regions must be coordinated, utilized and enhanced 
through training. A team approach to intelligence 
gathering is likely to have the most success. There is 
a great danger of 're-inventing the wheel' 
in Districts and Regions, if information 
departments and libraries do not cooperate. 

An example of the benefits to be had from a team 
approach can be seen in one of the pilot sites of the 
Developing Information Systems for Purchasers 
(DISP) project in Cambridge DHA Information 
Services Unit. A multi-disciplinary team provides 
library, contract information, health information and 
IT functions. A Health Information Workshop has 
been developed to run on networked PCs. It provides 
an integrated health information resource including 
access to management information systems, a 
combination of national, regional and local datasets, 
and textual information such as library catalogues, 
journal articles, and full-text transmission of articles 
from major journals. The Cambridge team includes 
information officers, statisticians, information 
scientists and librarians, epidemiologists, and 
researchers. They have developed links with other 
local organizations (eg. local authorities), RHAs, 
FHSAs, and national and quasi-national institutions 
such as the King’s Fund Centre, the Nuffield Institute 
for Health Service Studies, and the British Library. 
This ensures specialist backup information and 
document resources. 

There are many other aspects to the NHS 
information revolution: they are too numerous to 
discuss in detail, but in passing one can note the 
recent work on standards, coding and networking, 
and staff training development. All these aspects 
have implications for the future involvement of 
librarians and information scientists to a much 
greater degree than hitherto in information work in 
the NHS. 


The way forward 

In addition to the various initiatives considered, there 
is still a pressing need for a central body to take lead 
responsibility for the strategic development of 
coordinated national healthcare information provision, 
including all types of information, whether textual or 
numerical. This role could be filled by the Department 
of Health and the NHS Management Executive 
working together with the existing quasi-national 
information services, and the royal colleges, 
postgraduate schools, regional librarians, together with 
outside expertise from the British Library. 

In his concluding remarks to the British Library 
R&D Department Seminar held in July 1992 Health 
Care Information in the UK, David Russon, Director- 
General of the Document Supply Centre, British 
Library stated: 

For its part, the British Library will follow-up 
these issues with the Department of Health 
and the NHS in order to capitalize on the clear 
message of the Seminar, and it will continue to 
provide leadership and support by making 
sure that its services are better articulated 
with existing and future services in the 
provision of health care information. 
David Russon, 1992 

Existing NHS libraries are likely to play a much 
greater role in the NHS information revolution than 
hitherto, and librarians and information scientists 








librarians and information scientists are employed 
in these departments. Since the NHS has a skills and 
manpower shortage in information management, it 
is inconceivable that librarians and information 
scientists will not be employed in significant numbers 
in the future. 
Application of modern technologies in health science
libraries in India: a survey 


Dr RP Kumar 


All India Institute of Medical Sciences, New Delhi 


Abstract 

India is one of the oldest civilizations with a kaleidoscopic variety and rich cultural heritage. It has achieved multifaceted
socio-economic progress during the last 43 years of its independence. As the seventh largest country in the
world, India is well marked off from the rest of Asia by mountains and the sea, which gives the country a distinct
geographical entity. India comprises twenty-five states and seven union territories.

India has made commendable progress in the technological, engineering and communication fields. Modern 
technologies are applied to information handling. Production of hardware and software technology is domestic. 
National resources are augmented by establishing links with the international systems.

There are 106 medical colleges, and 40 dental colleges in India. Besides this, there are nursing colleges, 
pharmacy colleges and other institutions. Each college/institution has a library of its own attached to it. The 
libraries can be classified into Medical, Research, Ayurvedic, Homeopathic, Dental, Unani and Pharmaceutical 
Libraries. 

A survey was carried out, in the form of a questionnaire, on the usage of modern technologies in health sciences
libraries, e.g. photocopiers, microfilming, computers, facsimile transmission, audiovisual, online searching and
CD-ROM. Personal visits were made to a number of libraries and some of the librarians were also interviewed.
This paper examines the impact of modern technologies on medical libraries, and concludes with the problems faced
by librarians in adopting modern technologies. It also suggests the need and measures for implementing
modern technologies in health science libraries.


Introduction 

Health professionals and practitioners engaged in the
task of improving the health standards of the Indian
people need an efficient information support system
so that they can deliver the health care services 
effectively. This support becomes much more relevant 
in the national efforts to achieve the goal ‘Health for 
All by 2000’. Recent developments in computers and 
telecommunication technologies have revolutionized 
the modes and methods of information storage and 
retrieval. Now information can not only be stored,
retrieved, communicated and broadcast electronically
in enormous quantities and at phenomenal speed, but
it can also be rearranged, selected, marshalled and
transformed. Any sequence of operations on
information can be carried out without further human
intervention.

The developments in telecommunication and 
microcomputer technology have taken a quantum 
leap. In India, the relevant technologies are being 
acquired from developed countries under the 
Government of India policy in the required areas. The
level of expertise, competence and knowledge in the
field of computer communication and networking
software has been raised sufficiently to affect various methods
of communication within India and outside India.
Commissioning of an International Gateway Packet
System (GPSS) at VSNL, New Delhi and Bombay
has facilitated cheaper and easier communication
with online search services. Over 4000 databases are
available now, which can be accessed online through
locally available communication networks.


Furthermore, over 800 bibliographic, factual and 
textual databases are now available on compact discs 
which allow local access and hence do not involve 
communication cost. 


Scope 

This study attempts to examine in detail the use of
modern technologies in health sciences libraries. These
technologies are telecommunication, photocopying,
micrographics, audiovisual, computer applications,
online and CD-ROM etc. It also aims to pinpoint the
need for development of these modern technologies
appropriately.


Methodology and material 

A questionnaire was prepared and sent to different 
libraries in Delhi. A total of 54 responses have been 
analysed in this study. Initially it was planned to 
survey all the health sciences libraries in India, so the 
questionnaire was sent to about 700 libraries. But for 
the scope of this paper it has been restricted to the 
health sciences libraries in Delhi only. The medical 
librarians in Delhi were also personally interviewed. 


Some basic facts about India 

India is a country of 25 states and seven union 
territories. The executive head of the Indian union is 
the President. 


Physiography 
India is the seventh largest country in the world. The 
vast size of India needs no stressing. Reaching from
8°4' to 37°6' North latitude and from 68°7' to 97°25' East
longitude it measures some 3,200 km from North to
South and some 2,980 km from West to East, covering
an area of 3,276,141 sq. km (including Sikkim).


Climate 

The vastness of India is reflected in the wide range of 
climatic types to be found within it. In the plains, the 
desert of Rajasthan contrasts with the humidity of 
Bengal. The winter snows of the Himalayas contrast 
with the nearly equatorial heat of Kerala and 
Tamil Nadu. 


Medical education in India 

The present medical system of education owes its 
debt to the British traders who brought with them 
medical teams trained in the western system. Before 
this, the indigenous systems of medicine such as 
Ayurvedha, Unani and Sidha were in practice. 

A large number of medical colleges were established
after independence and the task of the Bhore
committee was to propose steps to remodel medical
education in the country. There are 466 medical and
allied health colleges and schools imparting education.

There is a force of 651,398 practitioners both in 
modern and Indian medicine in India, operating from 
5,568 primary health centres and 51,192 sub-centres. 

There are two health sciences universities in India.
Andhra Pradesh set the example by
establishing the first health sciences university in
India on 1 November 1986. The state of Tamil
Nadu was the second to start a university.


Medical libraries in India 

There are about 700 medical libraries in India attached
to medical and nursing colleges and district hospitals.
These libraries are located in different organizations 
engaged in research, education and training, 
administration and programme implementation 
relating to health and family welfare. 


Medical libraries in Delhi 
The 54 libraries which form part of this study can be 
categorized as follows: 
1) International 
2) National 
3) Academic 
4) Research 
5) Other Systems of Medicine
6) Hospital
7) Auxiliary Services 
8) Pharmaceutical 
9) Voluntary Associations 


1. International

1.1 WHO Regional Office for South-East Asia Library
This library is primarily meant to provide reference
services to WHO staff in the regional office and field
staff in the countries of the region. It also provides
reference services to medical and allied health scientists
in Delhi as appropriate and within the limits of its
resources. It provides free of cost MEDLARS/MEDLINE
searches and photocopies of references
not available in the country through the HELLIS
national focal points. The library has a computer,
microforms and CD-ROM. There is no audiovisual
section. Some of the inhouse activities are
computerized. It has its own telephone but uses the
centralized telex and fax facilities. As mentioned
above, it has photocopiers also.


2. Libraries of national status 

2.1 National Medical Library

It was started in 1926 as the library of the Directorate-General
of Health Services and was designated National
Medical Library only in 1966. The library collection
totals 245,000 volumes and it receives 2065 periodicals
annually. It has two direct telephone lines, and can
use the DGHS (Centralized) facility for telex and fax.
It has four photocopiers, one reader-printer and an
audiovisual section with 50 video-terminals and uses
the LIBSYS software package for cataloguing and
indexing/abstracting. It has CD-ROM and databases
like MEDLARS and EMBASE.


2.2 All India Institute of Medical Sciences Library 

The library was established in 1956 and caters to the
needs of teaching, research and patient care. It has a
total collection of 118,000 volumes and receives
about 500 periodicals annually. It has telephone
connections, as well as e-mail facilities through
DELNET, and it can use centralized telex and fax
facilities. There are three photocopiers, two microfilm
readers and one reader-printer, with 1556 microfiche/microfilms.
The library is equipped with 3 VCRs and
3 CTVs and has 115 video cassettes. The library has a
PC-AT with four terminals and uses software packages
such as LIBSYS, CDS/ISIS, Wordstar, Lotus and
Fortessy. It has automated acquisition and serial
control, and conversion of the card catalogue to
machine readable form is in progress. There is provision
to acquire a CD-NET workstation shortly with 8 nodes.


2.3 National Institute of Health and Family Welfare 
It was established in 1977 and receives about 500
periodicals annually. It provides information on
population, communication, health education and
health administration. The library has telephone
facilities but does not have access to telex, fax or
e-mail. It has two photocopiers, one microfilm reader-printer
and 5000 microforms. It has a PC-XT and uses
CDS/ISIS for press-cuttings and abstracting. It
possesses a CD-ROM drive and has access to the
POPLINE databases.


2.4 Indian Council of Medical Research 

The Indian Council of Medical Research was 
established in 1911. Over the years, it has built up a 
collection of research reports published by the council, 
and has a total collection of 30,000 volumes and 150
periodical titles. Although this is a library of national
status it does not have any modern technology, not
even direct telephone lines. Its searches are done
through the National Informatics Centre and the
National Medical Library.

The ICMR/NIC Centre for Biomedical 
Information, established in 1985-86, is working on 
developing discipline-oriented bibliographic databases 
on Indian publications in priority areas of health/ 
medicine. The centre has established a link through 
the international Gateway Packet System of Videsh 
Sanchar Nigam Ltd, with the National Library of 
Medicine (NLM), Bethesda, USA and currently 
performs 150-200 searches per week. The centre also 
has mini, micro and personal computers and uses 
software packages like CDS/ISIS, BRS etc. It has 
access to nearly eighty different databases on different 
subjects, three separate CD-ROM units and CD-NET 
which supports fourteen drives. It performs searches 
for doctors and researchers etc. 


2.5 Institute of History of Medicine and Medical 
Research 
The Institute is popularly known as the Hamdard 
Institute and was established in 1961. It has a training- 
oriented section which has books in Urdu, Persian and 
Arabic. These books deal with a wide range of subjects 
in the field of Unani Medicine. The library does not 
have any of the modern technologies listed in the 
questionnaire. 


2.6 Central Council for Research in Ayurvedha and 
Sidha 

The library of the council is still developing. There is
a documentation section also. This library also does
not have any of the modern technologies listed in the
questionnaire.


3. Academic libraries 
The following three institutions offer both graduate 
and undergraduate courses in modern medicine in 
Delhi. They are: 

1. Lady Hardinge Medical College
2. Maulana Azad Medical College
3. University College of Medical Sciences

The total collections of documents in these 
academic libraries vary from 20,000 to 50,000. The 
libraries have telephone and photocopying facilities, 
but none of them have a microfilming section. Only 
the Maulana Azad Medical College library has an 
audiovisual section with 95 videocassettes. These 
4.1 National Institute of Immunology

The library of the Institute, though it started functioning 
only in 1982, is very rich in modern technologies. It 
has direct telephone lines, but uses centralized telex 
and fax facilities. The library has three photocopiers. 
It has micrographic facilities and is equipped with one
reader-printer, having 375 microforms. It has an 
audiovisual section also. There is one PC which 
generates the current awareness service and a list of 
reprints of NII scientists. The software packages 
available are CDS/ISIS, Wordstar and Fontessy. The 
library has two CD-ROM drives and access to 
databases like the life sciences collection of CSA and 
POPLINE. The library sends its MEDLARS search 
request forms to the NIC. 


4.2 National Institute of Communicable Diseases 
This institute was established in 1963 and has a total 
collection of 26,000 volumes with subscriptions to 
160 journals annually. The library has a telephone 
line but does not have access to telex, fax or e-mail. It 
has only one photocopying machine but no 
micrographic facilities. It also does not have any of 
the other modern technologies. 


4.3 Vallabh Bhai Patel Chest Institute 

It was established in 1953. The total collection is
30,000 volumes and the library receives 180 periodicals annually. The
library does not have any technology except the 
telephone. It sends MEDLARS search requests to 
the NIC. 


4.4 Institute of Nuclear Medicine and Allied Sciences 
This is a very specialized institute in India. The 
library provides current awareness and a selective 
dissemination of information service. The library has 
automated some of its activities. They also send their 
MEDLARS search requests to the National Medical 
Library and National Informatics Centre. 


4.5 Defence Institute of Physiology and Allied 
Sciences 

It was established in 1962 and the total collection of 
the library is approximately 8,000 volumes, with 75 
journals subscribed to. The library has a direct 
telephone line, but little else in the way of modern 
technology. 


5. Other Systems of Medicine 
Under this category fall the following two institutions: 
1. Ayurvedha and Unani Medical College
3. Dr Ram Manohar Lohia Hospital
4. GB Pant Hospital
5. Kalavati Saran Children’s Hospital
6. Safdarjung Hospital
7. Kasturba Hospital
8. Holy Family Hospital
9. GM Modi Hospital
10. Army Hospital

All these hospitals are imparting postgraduate 
training in various disciplines in the health sciences. 
The libraries attached to these hospitals provide 
traditional library services. Unfortunately none of
these libraries have thought of using modern
technologies for the libraries, but we must remember
that some of them are of very recent origin.
7. Auxiliary services
Some of the institutes in Delhi also provide training
for auxiliary staff in health science. The most important
of these institutions are:

1. Raj Kumari Amrit Kaur College of Nursing
2. Institute for the Physically Handicapped
3. College of Pharmacy
4. Hamdard College of Pharmacy
5. Lady Irwin College
6. Institute of Health and Hygiene

All these institutes have reasonably good libraries,
but perform traditional library services. They do not
have modern technologies for libraries.
8. Pharmaceutical libraries

The questionnaire was sent to twelve pharmaceutical
libraries, but only nine of them responded. The libraries
which responded have telephone lines, photocopiers
and computers. They use centralized telex and fax
facilities, but none of them have CD-ROM and online
search facilities. They are using computers for
automating inhouse activities and providing current
awareness services.


9. Libraries of voluntary associations 

There are twelve libraries attached to voluntary 
associations. Because of inadequate budgets, they are 
not in a position to apply any modern technologies. 


Analysis of health science libraries 
The use of technology in libraries is analysed below, 
although the WHO library is excluded. 

1. Some of the libraries have automated their
library routines, and the majority of the rest are
planning to do so.

2. Only five libraries have an audiovisual section
in the library.

3. Only four libraries have micrographic facilities.

4. Only six libraries have CD-ROM and databases
on CD-ROM.





6. Hospital libraries are poorly developed and 
are quite backward, so far as modern 
technology application is concerned. 

7. Only eight libraries are members of DELNET. 

8. Only one health science library in Delhi has e-mail
facilities, but the others are planning to
acquire them.
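
The counts listed above are easier to compare as proportions. The following Python lines are a minimal sketch of that arithmetic. The base of 53 libraries is an assumption (the 54 responses less the excluded WHO library); the paper does not state the denominator, and the labels are only illustrative.

```python
# Counts taken from the numbered findings above.
counts = {
    "audiovisual section": 5,
    "micrographic facilities": 4,
    "CD-ROM and CD-ROM databases": 6,
    "DELNET membership": 8,
    "e-mail facilities": 1,
}

# Assumption: 54 responding libraries minus the excluded WHO library.
BASE = 54 - 1


def share(count, base=BASE):
    """Return count as a percentage of base, rounded to one decimal place."""
    return round(100 * count / base, 1)


for label, count in counts.items():
    print(f"{label}: {count} of {BASE} ({share(count)}%)")
```

On this assumption no single technology other than the telephone and the photocopier reaches even one library in five.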


Reasons for poor technology application 

As mentioned earlier, the methodology of the survey
was the use of a questionnaire and personal interviews.
The investigator had personal discussions with twenty
librarians in Delhi. The problems faced by the health
sciences libraries in Delhi are not much different from
those faced by libraries in other parts of the country.
The following are some of the reasons for slow
application of modern technologies:

1. Insufficient funds;

2. Lack of adequate trained staff;

3. Lack of proper in-service training for their
adoption;

4. Lack of initiative on the part of library
professionals;

5. No support from the authorities;

6. Lack of interest on the part of readers;

7. Poor telecommunication systems in the
country;

8. Lack of scope for librarians in health sciences
libraries in India to perform their duties
independently. The library is not recognized
as an important component of the institution.

9. Librarians of various categories function under
different administrative agencies in which there
is no unanimity.
Government initiative for technology development
Modern technologies are the nervous system of a
nation. Realizing the importance of technology,
especially since 1983, the Government of India has
taken certain bold initiatives. A package of measures
was announced on August 18, 1983 by Dr NS Sanjeevi
Rao, then Deputy Minister in-charge of electronics.
The package stressed the need for promotion, rather
than regulation, and reduction of input costs to achieve
economy of scale. Critical areas such as silicon,
microwave tubes, LSI/VLSI circuits and R&D for
electronic switching were identified as investment
areas by the Government, and it has been decided to
give due attention to the growth of materials and set
up a National Silicon facility. In the computer segment,
increasing emphasis has been placed on the application
of computers in education. With the introduction of
the national hook-up TV, the broadcasting network
had been expanded to cover seventy per cent of the
population, as compared to thirty per cent earlier.
NICNET and INDONET are functional.
electronic - are likely to operate alongside each other
for at least a few years to come. In the first instance,
the sale of printed material will subsidize the
development of document delivery systems that print
on demand. Over a period of time, the balance will
shift to electronic document delivery systems.

The author would like to make the following
suggestions:

1. All the health sciences libraries should have
telephone, telex, fax and e-mail facilities.

2. There should be between one and four
photocopiers depending upon the size of the
libraries.

3. The libraries should have an audiovisual
collection.

4. The medium-sized libraries should have a
microfilming unit.

5. Small health science libraries with a collection
of about 25,000 should have a PC-XT.

6. The medium-sized libraries, with a collection
ranging from 50,000 to 200,000 should have at
least one PC-XT with a CPU of 16 bits, 512 Kb
main memory, 2 Winchester discs (40Mb
each), 4 terminals, one floppy drive and one
200 CPS dot matrix printer.

7. A National information/documentation centre
for health sciences should be established in
Delhi similar to that of INSDOC, DESIDOC,
SENDOC, etc, which would be equipped with
all the modern IT equipment and would network
with all the local libraries.

8. There is a need to make librarians aware of the
latest technology so that they can impress
upon the authorities the advantages of applying
the latest technologies.
An Author at play in a computer-simulated world


Jane Dorner

Abstract


This paper suggests that computer manipulation of written material is insubstantial. Despite the huge choice of
electronic writing aids, few match up to what is required by professional writers, editors or translators. The creation
of texts on screens further desubstantiates the written word and forces copyright owners to redefine the ‘copy’ as a
unit of sale. One of the many challenges facing those involved in publishing is to make provision for the implications
of presenting and displaying work in electronic form.


1. Introduction 


If there is such a thing as virtual reality, which is a
computer-driven picture of the real world, apparently
similar but lacking the substance of the real thing,
then, I would like to argue, there is virtual writing.

Real writing is what a person can understand, 
using anything from intuition to prejudice to sift out 
meanings from ambiguities. A person will almost 
always interpret — however garbled — the syntax and 
grammar, however difficult the substance. Virtual 
writing is what a computer pretends to understand. It 
is a computer picture of the written language using 
algorithms and pattern-matching to find, display or 
analyse strings of text. This offers a play area in which
the computer is being ‘trained’ in three broad directions
that are of interest to the writer:

1. to offer shells for massaging creative ideas
into literary forms (random plot generators, 
poetry processors, crossword puzzle makers); 

2. to supply the need for immediate interactive 
reference (pop-up dictionaries, translation 
thesauri, CD-ROM encyclopedias); 

3. to act as an interactive editor or translator by 
apparently making judgements based on 
internal program algorithms (grammar- 
checkers, style analysers, automatic translation 
programs). 

In the research for my recently published book 
[Dorner (1)], I have played with 50-odd computer 
toys, some briefly and others in considerable depth. 
Many of these offer themselves as serious computer 
tools for professionals, with price tags to match (I 
estimate that the total value of software I have had for 
review is £5,000, exclusive of word-processors). 
Others have an educational value but are very 
constricted by memory limitations. Producers
obviously see this as a wide open market. The last two
years have seen an enormous growth of such products;
some new, others upgrades of previous products. Two
years ago, a survey showed that tools such as these
were in use by between 1% and 9% of professional
authors [Dorner (2)]. This growth must be aimed at
cost cutting. Figures released last year show that in
most offices a standard letter costs a company £3.50 a
page and documents originating from senior heads of
department cost £50 a page (paper, printing ink,
overheads, writing, reading and editing). Computer
tools aim to help cut costs by reducing time spent in 
research and quick reference and by making it easier 
to attend to style on screen rather than on paper. 


2. An Electronic Workshop 

Those involved in writing lag behind musicians and 
composers who have embraced the computer for its 
creative and labour-saving opportunities. Our 
equivalent would be a multi-screen workshop where a 
writer could swivel round in a semi-circle to see five 
or six screens monitoring text at different planning, 
writing or editing stages — a visual synchrony of the 
mental processes that occur during writing. Imagine 
being able to turn to — or ignore — that screen which is 
checking your grammar, or showing the structure of 
your ideas, or providing keyword links to everything 
else you ever wrote on the same subject, or identifying
changes made since the last draft, or simultaneously 
translating the work into five different languages. In 
this picture the writer does what the musician does 
with sound; knob-twiddles with words and ideas. In 
this way authors can play when inspiration lags, 
while accepting that final control always lies firmly 
in the operator's hands. It's just possible that too
much dependence on the computer would be
unhealthy, but how much that is to do with
ergonomics and safe usage of these electrical toys is
not under scrutiny here.


3. Rules and the Game 

In virtual writing you simulate real writing just as in 
virtual reality you could simulate real tennis. The 
game offers a complete framework in which the text 
is bound by a large number of rules. These can be 
obeyed or broken or changed to fit a specific need. 
Such rules are computer-agitated, shaking the subject
material until it fits the slots available, like ball-bearings
rolling over a perforated board. There are no
winners or losers; players are trying out their texts in 
this world of rules to see how they look or what can be 
learned from the process, much as an architect would 
rotate a computer-modelled house to explore uncharted
corners before going back to the drawing-board.

The scoring in this game of virtuality is not numeric; 
it is the more subtle effect of the human player 
seeking to prove superiority over a digital foe. There 
is something about the way in which the computer 
reduces everything to an ephemeral series of pixellised 
graphemes, in which on and off light switches provide 
the only substance, that requires a special alertness on 
the part of the writer. One of the reasons for the 
success of the computer as a modern linguistic tool 
must be that human beings, in their arrogance, enjoy 
pitting their wits against it. The attraction, perhaps, of 
machine-assisted translation is not so much the time 
saved in situations where vocabulary is strictly limited 
— e.g. standard letters, business reports, technical 
descriptions, business proposals, sales literature, bills 
of lading, instructions and directions — but the pleasure 
of post-editing and improving on the computer’s best 
guesses. A simple example from the hard-disk-hungry 
program Globalink is the German ‘In den letzten 
Jahren’ which translates as ‘in the last years’ rather 
than ‘in recent years’. 


4. Fun Element 

Scoring, of course, is not necessary in a simulation 
because the playing itself is a prop to learning. The 
modern trend in computer-driven writing makes such 
learning fun, with pretty pictures egging the user on to 
new discoveries about what each bit of software can 
do. People worry about the increase of this Noddy 
element on computer screens. The Delaware University 
research project — a project largely discredited by the 
academic community — that examined students' 
literacy level and choice of ‘serious’ subjects according 
to whether they used a Macintosh or an IBM running
DOS apparently ‘proved’ that the friendlier icon
environment fostered fluffier subject material and
lower reading levels in the writing.

The worry is that, in Pope's phrase, ‘Amusement
is the happiness of those who cannot think’ (and for
the record, I found that epigram in my pop-up 
dictionary of quotations, sliding it into my text like 
jelly off a hot spoon). Pope, of course, is a moralist, 
but perhaps there is a case for more serious study on 
the effect of the amusement value of computer-aided 
writing on the individual than the Delaware project. 
My hunch would be that the debonair point-and-click 
flirtation ends with the print-out, when the reality of 
the marriage reasserts itself. It doesn't take long to 
learn the acronym GIGO - Garbage In, Garbage Out. 

We need to decide if we can expect the computer 
to help writers, in the same way that virtual reality 
helps pilots to fly planes, or architects to design 
buildings. If it made a significant contribution to our
understanding of the way language works, that would
seem to be an achievement. However, it isn't clear to
me, yet, that all the toys and gadgets I have looked at
give us anything more than a very partial understanding
of language. Sir Randolph Quirk, commenting on
papers in a symposium on directions in corpus
linguistics [Svartik (3)], reinforces a cautionary
attitude. There is, he says:
a certain tension between (a) those who 
want to know as much as possible about 
language (and God bless the computer if 
it can help), and (b) those who want to 
know as much as possible about what the 
computer can do (and God bless it if this 
advances our knowledge about language).


Returning now to my three bands of computer 
help, can we ‘bless the computer’ at all for the help it 
offers? Names of products that I have played with are 
given in italics within each category. 


5. Structural Help 
Outliners, brainstorming and various kinds of 
conferencing software are in business usage. Some 
writers prefer to use an outliner to establish a structure 
and then flesh this out. There is a fundamental 
assumption behind the concept of an outliner that is 
open to question. It is based on the claim that good 
prose is a consequence of planned rhetorical 
organisation: you do your thinking first and then 
plaster in the gaps. Set against that is the view to 
which I subscribe — good prose is a consequence of 
spontaneous discovery: you paint in the broad 
brushstrokes and find out what you want to say as the 
colour spreads. A colleague describes this as the 
‘romantic approach’ as opposed to the ‘classical 
approach’ favoured by software grids [Galbraith (4)]. 
Controversial (and as yet unproven) as such a theory 
may be, it does at least have interesting implications 
for the design and use of software tools like outliners. 
Anything that offers a prompted system for helping 
the writer needs reviewing with some caution. It may 
be that outliners are useful for reorganising writing, 
but have serious limitations in the generation of ideas. 
What virtual writing requires is that a
poem, a play or a genre novel can yield to
algorithmic analysis: e.g. the story is the problem, the
computer program offers the procedure for solving it.
The simpler the medium, the more likely this is.
Haiku can be effectively computer-generated. It is
not obvious whether a person or a computer wrote
this:


All green in the leaves 
I smell dark pools in the trees. 
Crash, the moon has fled. 

Masterman* 
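
How little machinery such a verse needs can be shown with a slot-filling sketch. It is an illustration only: the template, the word lists and the function name are assumptions made for this example, not a reconstruction of the program behind the haiku above, and no syllable counting is attempted.

```python
import random

# Hypothetical word lists for each slot. A real generator would hold far more
# words and would check syllable counts, which this sketch does not.
SLOTS = {
    "colour": ["green", "grey", "white", "black"],
    "place": ["leaves", "reeds", "hills", "fields"],
    "sense": ["smell", "hear", "see", "trace"],
    "thing": ["pools", "bells", "clouds", "stars"],
    "noise": ["Crash", "Hush", "Look", "Bang"],
    "body": ["moon", "sun", "frost", "bird"],
    "verb": ["fled", "gone", "cracked", "sunk"],
}

# Fixed frame with named gaps (assumed for illustration).
TEMPLATE = (
    "All {colour} in the {place}\n"
    "I {sense} dark {thing} in the trees.\n"
    "{noise}, the {body} has {verb}."
)


def make_haiku(rng):
    """Fill every slot in the fixed template with a randomly chosen word."""
    choices = {slot: rng.choice(words) for slot, words in SLOTS.items()}
    return TEMPLATE.format(**choices)


if __name__ == "__main__":
    print(make_haiku(random.Random(1)))
```

Everything that reads as 'poetic' here comes from the frame and the word lists supplied by a person; the program contributes only the random choice.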


Random plot generation is another matter. 
Exploring a first thought that may have been 
randomly generated by typing answers to a 
questionnaire or clicking a mouse on a ‘box’ of 
themes is one way of thinking creatively. In creative 
writing classes (as in art and design) much is made 
of the value of developing random ideas and taking 
them beyond themselves. As an initial stirrer of the 
pot of ideas, a computerized story generator has as 
much to offer as any other technique, and very often 
more because it offers a playful exploration of the 
structures and processes of written language. 
However, given the complexity of human fictions, 
handling them scientifically is bound to be 
extraordinarily difficult. I've played with a few of 
these things and am provisionally prepared to declare 
that as a writer, I have no interest in them. They seem 
to me mere toys offering stereotype solutions and 
unimaginative combinations. They probably have their 
place in education, because in playing ‘what if?’ games 
a student learns to define ‘what happens next’. On 
balance, though, I see no future for literature in 
structural tools. 
Plots Unlimited, StorySpace, AIQ, Newman’s Poetry 
Processor, IdeaList, EndNote, Macrex, GrandView, 
Brainstorm, Thoughtline. 


6. Pop-up Help 

The second camp of linguistic interest is directed at 
converting paper books into efficient online reference. 
I do not mean multilingual databases such as the EC’s 
Eurodicautom in Luxembourg, but more home-based 
products that replace reference books. The reduction 
of the 20 volumes of the Oxford English Dictionary 
2nd edition or Grand Robert Electronique to a thin 
laser-disc is a modern miracle. They cost less and are 
more useful than the paper equivalents because of the 
magical sophistication of the search routines. Less 
ambitious products comprising pop-up dictionaries of 
usage, quotations, reference books, bibliography lists, 
multilingual dictionaries, and a thesaurus or two 
illustrate a current growth area. Success here depends 
on software and hardware engineering rather than 
linguistic expertise. Most of these lurk in background 
memory and can seriously reduce the amount of text 
you may hold in active memory. 


It is only a matter of time before computer 
memories are large enough and CD-ROM general 
enough for the pop-up complement to virtual writing 
to be commonplace. It will feel insubstantial to be 
checking a word-derivation or quotation or cliché via 
a series of light switches rather than hefting down a 
volume and thumbing its crisp white pages, but we 
will get used to it. I’m not sure that we should get used 
to it, because keystroke reference reduces the writer’s 
mobility and courts the dangers of eye-strain, back- 
ache and repetitive strain injury. 

I don’t quarrel with Isaac Asimov’s description of 
the ideal medium for information. It should, he thought, 
be optimally ordered, the medium should have a low 
energy requirement and should be portable. The 
acronym for that comes from Bound Optimally 
Ordered Knowledge: BOOK. For some reason we’ve 
given this the retronym ‘hard copy’. 
The Oxford Writer’s Shelf, The Oxford Science Shelf, 
The Concise Oxford Dictionary, The Oxford English 
Dictionary (CD-ROM), The Oxford Thesaurus, 
Complete Writers’ Toolkit, Lexica, Random House 
Encyclopedia, Quotemaster, Correct Quotes. 


7. Stylistic Help 

Finally, in my last band fall usage- and grammar- 
checking programs and style analysers, which must 
underlie mechanical translation. Here, software 
linguists have serious difficulties in playing off real 
live understanding against computer simulation. 
Clearly computers can only look at the structure of 
language without grasping its meaning. Chomsky’s 
famous colourless green ideas that sleep furiously 
would pass muster in any of the available checking 
packages. One must accept that computer products 
will be limited in their efficiency and use them as part 
of the process of critical reading. This is where the 
game factor comes in. It is more fun to sit clicking 
buttons, accepting suggested changes and ignoring 
inappropriate ones than to read yourself. Nothing can 
replace intelligent reading, but software may enliven 
the editing task. 

The challenge for program developers is to discover 
how you isolate the circumstances under which a 
computer can offer reliable advice on style and usage. 
Presupposing a market largely made up of business 
users makes the task a little easier. It is possible where 
error-trapping is reasonably trivial, and this in itself is 
of value to a writer-editor whose attention may 
inevitably nod. Thus it is easy enough to net a phrase 
like ‘of that ilk’ and a reasonable guess to assume that 
the writer doesn't know what 'ilk' really means. 
Software limitations, as well as a preconceived notion 
of audience interest, require disposing of this in two 
lines, but Philip Howard, in A Word in Time, has three 
pages on it. Here, as in so many usages, it is becoming 
so universal in its ‘incorrect’ form that only purists 
will object. 
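
The 'netting' of a phrase is the easy part, as a small sketch shows. The rule table, the wording of the notes and the function name below are all assumptions made for illustration; no commercial checker is being reproduced.

```python
import re

# Hypothetical rule table: phrase -> short usage note of the two-line kind
# described above.
USAGE_NOTES = {
    "of that ilk": "Originally Scots for 'of the same name or place'; "
                   "now widely used to mean 'of that kind'.",
    "very unique": "'Unique' is usually treated as absolute; "
                   "consider dropping 'very'.",
}


def flag_phrases(text, notes=USAGE_NOTES):
    """Return (offset, matched text, note) for each listed phrase found."""
    hits = []
    for phrase, note in notes.items():
        # Whole-phrase, case-insensitive match; meaning is never examined.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), note))
    return sorted(hits)


if __name__ == "__main__":
    for offset, found, note in flag_phrases("Poets and others of that ilk."):
        print(offset, found, "->", note)
```

The sketch catches the string and nothing else: it cannot tell whether the writer used the phrase knowingly, which is exactly the dilemma the two-line note has to paper over.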

Software linguists face a dilemma. Do you leave 
out all such usages so that the writer is never alerted to 
a possible area of concern? Or do you tuck it away 
with some bland expression like, ‘Avoid this phrase 
in formal writing'? Software that upbraids a writer 
too often is irritating and saps the confidence of 
inexperienced writers: software that fails to deal with 
one of the matters that can expose a writer to public 
scorn is unreliable and saps the confidence of 
experienced writers. 

I have yet to find a grammar-checker that really 
works, though some make brave attempts and some 
are available in dual language versions (e.g. checks 
your French grammar but gives explanations and 
tutorials in English). The writer at play does better 
with a concordance, though it is much harder work 
than flowing words through a grammar-checker 
because considerable thought goes into isolating what 
one wants to know. But it is an eye-opener to run a 
series of analyses and worth doing once in your 
writing life if only for the light it sheds on personal 
style: the variety of one’s own vocabulary; its musical 
flow; whether similar sentence characteristics occur 
across all one's writing; what weaknesses the relentless 
counting of words reveals. Such personally tailored 
analyses are of value to translators or originators 
alike. A concordance program is one of the more 
interesting electronic games in the writer’s playground 
but is hard work to operate. 
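
For readers who have never run one, the core of a concordance can be sketched briefly. This is a toy under stated assumptions (a crude tokenizer, invented function names) and not a description of the Oxford Concordance Program or any other product listed below.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(text, keyword, width=3):
    """Key word in context: each occurrence with `width` words either side."""
    words = tokenize(text)
    lines = []
    for i, word in enumerate(words):
        if word == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, word, right))
    return lines


def vocabulary_profile(text):
    """Count running words (tokens), distinct words (types) and their ratio."""
    words = tokenize(text)
    counts = Counter(words)
    return {
        "tokens": len(words),
        "types": len(counts),
        "type_token_ratio": len(counts) / len(words) if words else 0.0,
        "top": counts.most_common(3),
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog sat on the cat."
    for left, word, right in kwic(sample, "sat", width=2):
        print(f"{left:>10} [{word}] {right}")
    print(vocabulary_profile(sample))
```

The counting is trivial; the thought goes into deciding which word to look up and what the surrounding lines reveal about one's habits.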

Correct Grammar, Grammatik 5, StyleWriter, 
RightRighter, Macproof, TexAtrix, The Oxford 
Concordance Program, Docucomp, The Editor’s 
Assistant, Corporate Voice, Readability Plus, 
Globalink. 


8. Playing with Copies 

None of these categories excludes the other and they 
fit into a large library of tools for writers. In an ideal 
joint simulation they would link to each other. All 
would have access to the depths of pages of Howard, 
Fowler, Burchfield and Quirk et al hidden within the 
computer’s interior and providing discussions and 
cross-references at whatever level of detail required. 
Such a Fowler-in-a-box — to borrow Geoffrey 
Nunberg’s phrase [Nunberg (6)] — would require 
hypertext on CD-ROM with all its attendant copyright 
problems. How easy it then would be to assemble 
compilations plundered from other writers’ electronic 
commentaries. 

For in that word ‘copyright’, which we have come 
to regard as the raison d’être of all creative work, are 
a host of assumptions. At a basic level is the notion 
that a text (original writing or translation) is a ‘copy’ 
or saleable unit. And ephemeral as an electronic text 
may seem, it is still to be regarded in the same way as 
its ‘hard’ equivalent. Copyright exists as soon as you 
type the words even if you have the monitor switched 
off and cannot read them. (It doesn’t exist in a second- 
language text that is an unaided translation by a 
computer program, but that is another issue.) 

However, computer programs such as the ones 
under discussion encourage the user to play with 
electronic texts; to manipulate, change, check and 
analyse them. At the very least such manipulation is 
one way of animating another writer’s characters on 
screen: should the user then be paying for film rights? 
This is a facetious point, but not as frivolous as it 
sounds, for the unfortunate truth is that there is no 
adequate provision for texts that exist in electronic 
form. Online databases and text storage archives do 
not have a collecting system in place whereby 
copyright holders are reimbursed for the copying on 
of their texts. The honour system still holds but 
whether it is useful in a world where the technology 
positively encourages people to patch, purloin and 
plagiarize is another matter. 

This aside, the computer tools in existence make it 
very easy for writers to supply machine-readable 
texts ready for use in the publication process. But 
publishers have so far been slow to regard this as an 
advantage. Only a third of word-processing authors in 
a recent survey reported that their publishers were 
interested in using their discs [Dorner, (2)]. Moreover, 
the Minimum Terms Agreement negotiated with 
publishers by the Society of Authors and the Guild of 
Writers has no clauses recognizing the author’s right 
to offer to the publisher electronic copy, nor discusses 
division of responsibilities or circumstances of use of 
such electronic copy, nor the subsidiary rights attendant 
on electronic copies. These are large issues, worth 
more attention than there is space for here, and 
introduced to indicate how ill-equipped we are to 
make provision for this world of virtual writing. 
And if this is a virtual world, why not add sound 
and moving pictures to pep it up? After all, virtual 
reality is an illusion, and if my parallel is to work then 
the audiovisual element is essential. But is my analogy 
a good one? I am myself divided. In some sense I feel 
that, limited though they are, the programs I have 
seen educate because they allow playtime with written 
language. The point-and-click magic of correcting 
sentences on the wing is both fun and face-saving. 
This is something we have been missing in library- 
book jeremiads. On the other hand today’s commercial 
products do seem tame. Maybe this will change in a 
few years’ time, when language corpora (such as 
COBUILD and Lancaster/Longman) have analysed 
hundreds of millions of words and phrases to give a 
databank of current English usage. Then — given 
adequate memory requirements — automatic parsers 
may begin to look sensible. Then translation may be 
made easier. Until that time we must remember that 
computers are playing in a world of virtual writing — 
very like the real thing, but without its substance. 


Jane Dorner. All rights retained. No copyright in this 
text is assigned to any publication. 


Abstract 


Two aspects of the selection and evaluation of software packages are reviewed: the strategy for evaluation and 
selection, and criteria that might be applied in selection. The evaluation and selection of a software package should 
be approached as a project. Appropriate strategies for the selection and evaluation of software packages can be 
based on information systems methodologies. The main stages in the project are: definition of objectives, evaluation 
of options, definition, selection and design, implementation and evaluation and maintenance. A system requirements 
specification is an important document in this process. Software selection must be guided by appropriate criteria. 
General criteria are cost, lifetime and life history, originator, supplier, support, maintenance, technical considerations 
and compatibility, ease of use, interfaces and integration. In addition specific criteria must be developed for specific 
categories of packages. A checklist of criteria for database packages is given. 


Introduction 

This paper focuses on the selection and evaluation of 
software packages. There are two main sections to the 
paper: the strategy or methodology that might be 
adopted during evaluation, and the criteria that might 
be applied in the selection of the most appropriate 
software package. Both of these issues are addressed 
from a practical angle. A much more theoretical 
approach could be taken, especially in the area of 
information systems methodologies, but this is not 
felt to be appropriate. 


Software packages 

The focus of this paper is the selection and evaluation 
of software packages. This process will normally take 
place as part of the selection of a complete computer 
system involving hardware and software, and the 
strategy for the complete system might follow a 
similar pattern to that outlined below. Nevertheless, 
this paper will concentrate on the selection of a 
software package. 

A further assumption underlies the comments made 
here. This is that for most applications software should 
be acquired in the form of a prewritten, commercially 
available software package. It is possible to develop 
software in-house, although the design and 
implementation of in-house software definitely 
requires greater skills of systems analysis, 
communication, liaison and project management, than 
the acquisition and implementation of a pre-written 
package. A package, like all off-the-peg garments, is 
not tailor-made and cannot be expected to cater for 
every little idiosyncrasy in every application. Usually 
this is an advantage; it encourages users to adopt more 


should be robust, and supported by documentation, 
user groups and other users, training, help desks and 
maintenance arrangements. Such support is especially 
crucial for the newly computer literate information 
manager, but should also be welcomed by all who 
have to manage a computer system. 


A STRATEGY FOR EVALUATION AND 
SELECTION 

Given the range of options when considering a new 
computer system, a strategy for evaluation and selection 
and for the management of the project is essential. 
Such a strategy can be exploited both to assist in 
choosing an appropriate software package, and to 
design the system that will be created with the software 
package. Analysis and data gathering associated with 
both of these activities can evolve in parallel. Any 
strategy that is adopted will identify a number of 
stages through which the project must pass. These 
stages will take time. Very often the time involved in 
the selection and implementation of a computer system 
is seriously underestimated, leading to late 
implementation, unfulfilled expectations and other 
associated hazards. A small system should merit much 
less planning time than a large system, but planning 
remains necessary. 

The strategy which is proposed below assumes 
that the systems project evolves from application to 
software to hardware. In other words, the requirements 
of the application are identified, prior to the 
introduction of constraints that may be posed by 
hardware and software. In some environments this is 
an idealized model. Sometimes hardware has already 
been purchased, or it is desirable to adhere to externally 


Selection and evaluation of software 





Systems analysts have developed information 
systems methodologies to assist in the analysis and 
design of information systems. Whilst these 
methodologies are probably too detailed to be applied 
in the selection of a software package there are a 
number of characteristics of such methodologies that 
might usefully be integrated into any strategy for 
systems analysis and design. These are: 


1. The methodology specifies a detailed and clear 
series of phases, sub-phases and steps through 
which the project must pass. 


2. A top-down approach is taken which starts with 
the development of a broad overview of the system, 
and then progressively more detail is added. 


3. One step in the methodology leads to the next, 
with increasing refinement. 


4. Quality assurance steps that must be completed 
are positioned throughout the project. 


5. Heavy reliance is placed on the systems 
specification which is drafted early in the project 
and revised as work proceeds and additional 
insights are gathered. 


6. User consultation is emphasised throughout the 
project. 


7. Checking on perspectives by user consultation 
and the use of a range of tools and techniques. 


8. The use of specialist tools and techniques, many 
of which are graphical to aid in communication 
and to act as part of the record of the system. 


9. Early emphasis on what the system will do, or 
logical analysis, before the analyst moves on to 
consider how this should be achieved. 


A model for a strategy for the selection and 
evaluation of software packages is proposed below. 
Other models may be developed, to suit local 
circumstances. The specific stages in the strategy are 
not important, provided all key issues are addressed. 
The existence of a strategy is crucial to effective 
communication and management. The strategy 
proposed comprises the following stages. 


Definition of objectives 

The first stage must be to define the objectives of the 
project. This may commence with an analysis of the 
existing system if this is felt to be an appropriate basis 
for development. This phase probes why procedures 
are performed, what improvements are possible and 
what lends itself to computerization. A new system 
must not, however, be hampered by previous practices 
and it may be more fruitful to perform a needs analysis 
and to define the objectives of the new system without 
reference to any earlier systems. A cost benefit analysis 
may be appropriate, but costs of both the existing and 
the new system may be difficult to identify in any real 
sense, and benefits may defy analysis in financial 
terms. Cost benefit analysis can be a useful tool where 
costs and benefits can be clearly identified. This stage 
should culminate in the generation of a short document 
(the shorter the better!) which could be described as 
the outline systems specification. This document, 
which should identify the general objectives of the 
new system, should form a basis for in-house 
communication, communication with vendors and 
other systems personnel and act as a reference 
document for later consultation in that it lays down 
what the project is intended to achieve. Note that at 
this stage the emphasis is on the application and not 
software or hardware. Perceptions and visions will be 
influenced by prior experience of computer systems, 
and the extent of detail of the outline specification 
will vary in accordance with previous experience, but 
the emphasis must remain with the application. 


Evaluation of options 

This phase concentrates on the identification of the 
candidate software packages, their examination and 
the refinement of the outline systems specification. 
The central activity of this stage is the collection of 
information. Both a general awareness of systems 
available and trends in the marketplace, and more 
specific knowledge concerning some key packages 
must be acquired. Attendance at exhibitions, 
conferences, and seminars offers the opportunity to 
gain an overview and to examine specific products. 
Directories, such as those listed in the references at 
the end of this article, list the systems available on the 
market, and facilitate preliminary comparisons. 
Accounts of the application of packages are worth 
tracing in appropriate periodicals, since they offer a 
different perspective in showing how a package can 
be exploited and may trigger new ideas. Software 
packages can be examined at exhibitions, and 
demonstrations at the purchaser's location may be 
appropriate once detailed examination of a specific 
system is required. Evaluation versions of packages 
can be invaluable for the reasonably experienced 
computer user, but may prove a little daunting for the 
new user. Above all else, the range of options, and, 
perhaps more importantly, the problems that are likely 
to lurk around the next corner, can be gleaned from 
other users and members of user groups. All reputable 
software suppliers should be able to recommend a 
few successful implementations of their system from 
which one or two can be selected for a visit or 
discussion. 

The refinement of the outline specification will 
need to take into account local circumstances. These 
may include hardware constraints and any special 
aspects of the application, such as downloaded MARC 
records, or graphic display images alongside text. 
These local constraints may serve to narrow the choice 
of packages significantly. 


Definition 


During this stage attention is refocused on the 
application under consideration. With the greater 
enlightenment gleaned during the information 
gathering of the previous phase it should be possible 
to specify a number of aspects of the application in 
greater detail. Typical issues that might be addressed 
include: 

• databases to be established 
• database sizes 
• growth rates 
• record structures, for both bibliographic and non-bibliographic data 
• the information that will be sought from the database 
• the form in which information needs to be presented 
• the users of the system, their roles and experience in relation to the system 
• how the system might be implemented, including personnel involved and project timescales. 
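
Database sizes and growth rates can be turned into rough planning figures with simple compound-growth arithmetic. The sketch below is one possible approach, not a prescribed method: the function names, the 50% index overhead and all figures in the usage example are illustrative assumptions.

```python
def project_records(initial_records, annual_growth_rate, years):
    """Projected record counts for year 0..years under compound growth."""
    return [
        round(initial_records * (1 + annual_growth_rate) ** year)
        for year in range(years + 1)
    ]


def storage_estimate_mb(records, avg_record_bytes, index_overhead=0.5):
    """Rough storage need in megabytes.

    index_overhead is an assumed fraction added for indexes; the true
    figure depends on the package and should be confirmed with the supplier.
    """
    return records * avg_record_bytes * (1 + index_overhead) / 1_000_000


if __name__ == "__main__":
    # Illustrative figures only: 10,000 records growing at 10% a year.
    counts = project_records(10_000, 0.10, 3)
    print(counts)
    print(storage_estimate_mb(counts[-1], avg_record_bytes=500))
```

Estimates of this kind belong in the detailed specification so that vendors can confirm that a package will cope with the projected size, not merely the opening size.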

The outcome of these deliberations should be 
recorded in a more detailed systems specification 
which should form a basis for detailed discussions 
with systems vendors and any contractual agreements. 
This systems specification is also the detailed reference 
document. 

The systems requirements specification or 
specification of operational requirements is an 
important document. It has three basic roles: 


1. as a communications document to support 
discussion amongst those concerned with the 
development of the system 


2. as a reference document for consultation during 
implementation, maintenance and review 


3. as a legal document which may form part of a 
contract with a supplier 


The focus of the system requirements specification 
is the details of the facilities to be provided by the 
computerized system, with an indication of those 
which are mandatory and those that are merely 
desirable. The size of the system, in terms 
of the number of records and transactions to be handled, 
is also important. Background and environmental 
information, together with information on any special 
constraints should also feature. Lastly, a timetable for 
the implementation of the system is most important as 
a basis for monitoring progress. 
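
The distinction between mandatory and desirable facilities lends itself to a simple screening rule: reject any package that lacks a mandatory facility, then rank the remainder by the desirable facilities they offer. The sketch below is a minimal illustration of that idea; the function name, feature names and weights are hypothetical and would be replaced by the contents of the actual specification.

```python
def evaluate(package_features, mandatory, desirable_weights):
    """Screen one package against a requirements specification.

    package_features  -- set of facilities the package offers
    mandatory         -- facilities that must all be present
    desirable_weights -- mapping of desirable facility -> weight
    """
    missing = sorted(set(mandatory) - set(package_features))
    if missing:
        # A single missing mandatory facility rules the package out.
        return {"acceptable": False, "missing": missing, "score": 0}
    score = sum(
        weight
        for feature, weight in desirable_weights.items()
        if feature in package_features
    )
    return {"acceptable": True, "missing": [], "score": score}


if __name__ == "__main__":
    # Hypothetical specification and candidate packages.
    mandatory = ["boolean search", "marc import"]
    desirable = {"thesaurus control": 3, "proximity searching": 2}
    candidates = {
        "Package A": {"boolean search", "marc import", "thesaurus control"},
        "Package B": {"boolean search", "proximity searching"},
    }
    for name, features in candidates.items():
        print(name, evaluate(features, mandatory, desirable))
```

A score of this kind supports discussion; it does not replace the judgement of those who will use the system, and the weights themselves should be agreed with users.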


Selection and design 

Having identified what is required of a system, it 
should be possible to identify which system best 
meets the requirements for a given application. 
Quotations or tenders should be obtained from a small 
number of potential systems suppliers and formal 
negotiations for purchase should commence. For a 
small, cheap system purchase agreements and 
associated contracts may be relatively straightforward, 
but with a larger investment considerable debate may 
be important at this stage. An integrated package of 
hardware and software needs to be selected and 
the capability of all components tested. Once a 
suitable configuration has been designed, orders can 
be placed. 


Implementation 

After orders have been placed there may be a lull in 
activity whilst delivery of software, and in some 
circumstances, hardware is awaited. The opportunity 
can be taken to review the plans for implementation 
of the system. Implementation should start with any 
necessary building works, office rearrangement, or 
network installation, followed by hardware installation. 
Once hardware has been installed, or if hardware is 
already in place, software installation can proceed. 
This will start with the establishment of a small trial 
database, and associated elements of a trial system. 
Hardware and software, and their capability, should 
be tested and any problems resolved in consultation 
with suppliers. 

Once the system is operational, databases can be 
established, and any system design, such as the format 
of report forms and printed outputs, undertaken. 
Security arrangements, such as the allocation of 
passwords and user IDs and write or read access to 
specific parts of the database need to be put in place. 
Staff training is essential, and appropriate user 
familiarization programmes should be planned. Once 
these preparatory measures have been completed the 
system can move into full operation. 


Evaluation and maintenance 

After a short period of operation initial evaluation 
needs to be conducted. Is the system meeting the 
objectives identified at the beginning of the project? 
If not, where and how can improvements be made? 
Longer term evaluation needs to be conducted at 
regular intervals, say, annually, in order to ensure that 
the system is still meeting its objectives, and, indeed 
that those objectives are still valid. Software 
maintenance will often be provided by the software 
supplier. Upgrades and updated versions of the 
software will be available for implementation at 
intervals. 


CHOOSING SOFTWARE 

The new user of a software package often 
experiences difficulty in drafting an appropriate 
specification of the features that might be useful in a 
software package, and thus experiences difficulty in 
developing the crucial systems requirements 
specification. This specification must be unique to 
each application, but often a general checklist of the 
functions that are normally available in a specific 
kind of software package forms a useful basis from 
which the new user can develop their own specification. 


Checklists can be further developed by examining the 
features of the systems in the marketplace, and indeed 
need to be regularly updated to accommodate 
developments. 

Criteria that might be included in a Checklist can 
be divided into two categories: 


1. those that are applicable to all software packages. 

2. those that relate to the specific kind of software 
package. 

Here, we will first briefly review those criteria in the 
first category. 


General criteria for software selection 


1. Cost 
Cost is clearly a consideration, but since, in general, 
you get what you pay for, cost should not be a 
primary consideration. Software cost may also be 
a small component of the costs of the entire 
system, and better software may significantly 
reduce operating costs. 


2. Lifetime and life history 

Well established packages, preferably with the 
same supplier throughout their lifetime are to be 
favoured. In many fields packages have now been 
available for ten or more years. Although not all of 
the best packages have done so, it is preferable if 
the package has enjoyed a solid reputation through 
a number of releases. 


3. Originator 
The reputation of the systems house responsible 
for writing a software package is important. 
Experience with other packages supplied by the 
same originator may be used in assessing a new 
package. 

4. Supplier 
With specialist software the supplier is often the 
originator, but with standard business packages 
there is often an agent acting as supplier. The 
user may look to the supplier for support and 
needs to feel confident that this will be 
forthcoming. The supplier’s reputation and history 
should be considered. 


5. Support 
Most suppliers or originators offer some support. 
Good, readable manuals should be the norm. 
Other support may take the form of on-site 
training, off-site training, consultancy, assistance 
in setting up a system, and a help desk. Some 
software packages have associated user groups 
and user group membership may provide a valuable 
source of information on the package. Both the 
quality and the cost of these support elements 
must be considered. 
6. Maintenance 

The software package should be appropriately 
maintained by the supplier. Maintenance involves 
removing bugs or errors, and improving the 
software so that it incorporates new facilities and 
concepts. Many software suppliers offer maintenance 
contracts at about 10% of the price of the 
original package and this entitles users to new 
releases of the software. Other suppliers offer special 
discounted rates for upgrades to existing users. 

7. Technical considerations and compatibility 
The software must run under the operating system 
available in the hardware configuration to be used, 
and must also be available in a version that is 
compatible with the hardware. With the move 
towards UNIX-based systems and extensive use 
of DOS in microcomputer systems, compatibility 
is less of a problem than it once was, but still needs 
to be carefully checked. 


8. Ease of use 
The quality of the human-computer interface is 
important for any software package. Factors to be 
considered include dialogue design and screen 
display. 

9. Interface and integration 
Most software packages should be able to export 
and import data to and from other packages of the 
same kind, such as between two word processing 
packages or two database packages. Some 
software will also export data to other kinds of 
packages as from, for instance, a database package 
to a word processing package. Other packages 
may be part of an integrated suite of software that 
supports different activities such as word 
processing, databases, graphics and spreadsheets. 
It is important to be able to reuse data in a system 
in different formats so a high level of flexibility 
should be sought. 


Criteria for specific software packages 

All of the earlier sections of this paper are applicable 
to the selection of any software package for any 
application. Criteria appropriate to specific software 
packages reflect on the facilities offered by these 
packages. A series of checklists must be developed to 
accommodate different applications. Since I am not 
sure which application areas may be of interest to you 
individually I have chosen to develop a general 
Checklist for the evaluation of database systems. 
Database systems may cover a wide variety of different 
applications. This list may be developed to include 
features appropriate to special purpose database 
systems, for example, terminology management or 
the recording of business transactions. Alternatively, 
it may be viewed as a model checklist showing the 
nature of such a checklist and from which it may be 
possible to develop similar checklists for other 
applications. 

Checklist 1 lists the key features of database 
software. The basic database structure and its definition 
are considered first, followed by features associated 
with data entry. Ease of data entry and appropriate 
security to protect data integrity are important. All 
database systems must consider retrieval. In some 
systems, such as text management systems, more 
sophisticated facilities are available for indexing than 
in other systems. Equally the retrieval facilities may 
vary in nature and quality between systems. Once 
retrieval has been effected, data must be output to 
screen or in printed format. Security of data and 
software must be carefully controlled, especially in 
systems where there are many users. 


Conclusion 

The evaluation of software should be approached as a 
project. A methodology for conducting 
the evaluation of software has been outlined. 
Criteria have been offered as an aid in the evaluation 
of software. Both general criteria and criteria 
applicable to the evaluation of database packages 
have been given. 


Checklist 1: a checklist for the evaluation of 
database software 


1. Defining the database 
Basic facilities 
What parameters are specified at definition? 
What parameters can be modified once data has 
been added? 
Display of data structure 
Modification of data structure, both before and 
after data has been added 
Relational database facility 


2. Data entry 
Preformatted screens 
User-definable screens 
Full screen editing and word processing 
facilities 
Easy amendment of existing records 
Easy deletion of existing records 
User-defined fields 
Variable and fixed length fields 
Flexible field lengths 
Use of windows to offer help and display 
authority lists 
Protected screen areas eg. labels 
Data validation 


3. Indexing 
Constraints on indexing terms eg. length 
Stoplists 
Go-lists 
Facilities for handling personal and corporate 
names 
Controlled and natural language indexing 
Selection of fields for indexing 
Definition of field indexing characteristics 
Authority control for names 
Thesaurus control for subject terms, showing 
synonyms, homonyms, broader terms and 
narrower terms 


4. Information retrieval 
Boolean search operators 
Field limited searching 
Proximity searching 
Truncation 
Range searching 
Search history 
Save search 
Online help facilities 
Display of index 
Display of thesaurus 
Use of related terms in searching 
User-friendly interface 


5. Output facilities 
Pre-formatted screen display 
User-defined screen display 
Printed output formats, both preformatted and 
user-defined 
Special print formats and other facilities for SDI 
Table creation and statistical analysis facilities 


6. Security 
Passwords 
User IDs 
Read only access 
Access restricted to certain records, and/or 
certain fields 
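
A checklist of this kind lends itself to a simple weighted comparison of candidate packages. The following is a minimal sketch in Python and is not part of the original paper: the six items are only a sample of Checklist 1, and the weights and the names Criterion, score_package and missing_essentials are assumptions made purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    section: str  # checklist section, e.g. "Data entry"
    feature: str  # checklist item, e.g. "Data validation"
    weight: int   # assumed importance: 1 (nice to have) to 3 (essential)


# An illustrative subset of Checklist 1; the weights are hypothetical.
CHECKLIST = [
    Criterion("Defining the database", "Relational database facility", 2),
    Criterion("Data entry", "Data validation", 3),
    Criterion("Indexing", "Stoplists", 1),
    Criterion("Information retrieval", "Boolean search operators", 3),
    Criterion("Output facilities", "User-defined screen display", 2),
    Criterion("Security", "Passwords", 3),
]


def score_package(supported: set[str]) -> float:
    """Return the weighted share (0.0 to 1.0) of checklist items a package supports."""
    total = sum(c.weight for c in CHECKLIST)
    met = sum(c.weight for c in CHECKLIST if c.feature in supported)
    return met / total


def missing_essentials(supported: set[str]) -> list[str]:
    """List the items weighted as essential (3) that the package lacks."""
    return [c.feature for c in CHECKLIST
            if c.weight == 3 and c.feature not in supported]
```

In practice the evaluator would extend CHECKLIST to the full list and set the weights to reflect the application in hand; a package that scores well overall but appears in missing_essentials would still deserve a closer look.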
Summary 


A system to provide scientific current awareness for R&D staff at SmithKline Beecham Pharmaceuticals is 
described. The system is based on an innovative combination of personal bibliographic software (Pro-Cite), for 
capturing and reformatting data from different sources, and the use of a large central database run under BASIS 
text management software for duplicate removal and bulletin production. This combination has proven an 
effective and robust platform for the provision of the service on a multi-site, international basis. 


1. Introduction 

In the competitive climate which prevails today, no 
pharmaceutical concern can thrive without completely 
harnessing the scientific talents and creativity of its 
R&D staff. Success in the marketplace hinges, first 
and foremost, on innovation. Only innovative 
companies can expect to reap substantial net benefits 
from the introduction of new drugs, given the enormous 
upfront costs of development. 

In order to operate effectively, scientists need to 
keep up-to-date with what is happening in the wider 
scientific world in a number of different ways. This 
includes having good contact with scientists of similar 
interests. Contact can be fostered by attending meetings 
and seminars: networks or ‘invisible colleges’ of 
recognized experts develop in this way. But the main 
means of communicating scientific information is by 
use of the scientific literature. This comprises an 
enormous collection of scientific journals: for example, 
the Institute for Scientific Information indexes around 
4,500 different scientific journals each year, 
containing a staggering 700,000 articles, for the 
SCISEARCH database. 

Scientists should browse key journals for 
themselves. The original journal articles provide rich 
information about ideas and methodology, and 
browsing expands the personal knowledge base 
through serendipity. However, given the enormous 
volume of scientific literature, scientists additionally 
need to use computerized techniques and/or 
intermediary assistance to provide optimum coverage 
of topics of interest to them. It is the challenge of 
information professionals to ensure that scientists 
have the tools and services available to them that can 
provide this optimum, and to do so in a way that 
counters, rather than feeds, the ‘information overload’ 
experienced by all scientific workers. 

The armamentarium for providing current 
awareness is impressive, if not bewildering, in variety. 
A large number of secondary sources are available, in 
hard-copy or electronic formats, and in ‘off-the-shelf’ 
or customizable vehicles. Some systems employ 
information professionals in an intermediary role, 
others do not. Many of the options have been recently 
reviewed (Rowley, 1992). 

I describe here a system developed at SmithKline 
Beecham for providing scientific current awareness 
via a series of hard-copy bulletins, making use of data 
derived from different types of electronic sources as 
well as from inspection of original journals. This 
system makes use of Pro-Cite software to reformat 
scientific bibliographic data derived from these varied 
sources. The eclectic catchment, and coherent format, 
of the bulletin information, combined with the 
extensive use of automation, provides the basis for an 
effective, high-quality, and labour-sparing current 
awareness service. 


2. Development of the system 

2.1 Background 

A thorough review was undertaken of the major 
scientific current awareness systems existing in the 
various parts of the company at the time of the merger 
of the Beecham and SmithKline Beckman 
organizations in 1989. There were 3 major systems 
for providing scientific current awareness to 
pharmaceutical R&D staff: 

• the Beecham organization provided a weekly 
25-section bulletin produced by manually 
scanning journals and various secondary sources 
(users selected whatever combination of 
sections they needed); 

• SmithKline & French in the UK produced 2 
bulletins at 2-3 week intervals by similar 
scanning methods, although in this case the 
production of the bulletins was automated via 
use of a BASIS K database; 

• SmithKline & French in the US used around 
70-80 tailored, weekly, computerized profiles 
produced by the Institute for Scientific 
Information from the SCISEARCH database. 

Overall, these systems directly supplied most of 
the scientific awareness needs of 1200 R&D staff at 8 
sites in the UK and US. A merger-related working 
group looking at the integration of scientific and 
technical information recommended that a single, 
unified bulletin system be introduced to replace the 
existing systems. This would bring benefits resulting 
from economies of scale and from the co-operative 
team-working of information staff from many R&D 
sites. A special project team — the Scientific Current 
Awareness Project (SCICAP) — was set up to make 
detailed recommendations about the new system. 


2.2 User input 

150 users were surveyed in depth about their preferences 
regarding scientific current awareness. Each user 
was shown examples of the 3 mainstream products. 

There was much individual variation in response, 
but a general consensus emerged on the main issues: 

• currency: most view a lag of up to 2 weeks between 
the time of journal receipt at the site library and 
notification via the service as acceptable; 

• comprehensiveness: it is vital for the service 
to be complete in core areas, and to achieve this 
the coverage of journals not held locally is 
essential; some more selective coverage of 
peripheral areas and of items of general interest 
is also important; 

• enrichment of citations: if possible, the 
bibliographic details should be supplemented 
by keywords or short abstracts: this enables a 
better decision to be made regarding the need 
for inspection of the original article; 

• contact: it is important that good contact exists 
between users and intermediaries in order to 
ensure the full and up-to-date understanding of 
user interests: only this can enable the correct 
selection of material to occur. 


2.3 Design of the system 

The product chosen to best meet the needs of the user 
was Science Alert. This was to develop as a series of 
14 bulletins covering all the relevant therapeutic areas 
and scientific disciplines. One product, ‘Highlights & 
Reviews’, would bring together key papers from all 
these areas, together with current reviews. 

It was determined that Science Alert would be 
produced by use of a database. This would permit the 
elimination of duplicate material and provide for high 
quality output. BASIS K was chosen as the text 
management tool for the database because of its 
proven abilities in these areas. The BASIS K database 
was named BULLETIN. 
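
The paper does not say how duplicates are recognized within BULLETIN. As a rough illustration only, the sketch below assumes that two records describe the same article when their normalized title, year and first page agree; the matching rule and the names dedup_key and remove_duplicates are hypothetical and do not represent the BASIS K mechanism.

```python
import re


def dedup_key(record):
    """Build a comparison key from title, year and first page (an assumed rule)."""
    # Lower-case the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    first_page = record["pages"].split("-")[0].strip()
    return (title, record["year"], first_page)


def remove_duplicates(records):
    """Keep the first occurrence of each article, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Keeping the first occurrence means that the order in which sources are loaded decides which version of a citation survives, so a source with richer records could be loaded first.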

The most important element in the production of 
Science Alert is the information intermediary. Each 
of 20 information scientists is assigned 2-3 interest 
areas to cover. Contact with scientific staff is via 
‘monitors’ assigned by R&D departments. The 
information scientists establish close contact with the 
monitors and collate the composite information needs 
of the client groups. The assigned information scientists 
attend the relevant scientific progress meetings 
wherever possible. 
The client groups each nominate around 20 ‘core’ 
journals. The intermediaries focus their manual 
scanning efforts on these journals. This is to keep the 
manual effort to a sustainable level, whilst allowing 
selection by browsing to occur from these key sources. 
The intermediaries may select brief abstracts and may 
add some keywords to items. This provides for a 
degree of citation enrichment. In order to assure good 
timeliness for the product, selected data is input into 
the database within a few days of journal receipt. 

The manual scanning of core journals cannot, by 
itself, ensure the completeness of coverage required. 
The SCICAP group therefore sought other data sources 
that could be exploited to provide more complete 
cover. Two data sources were found to be most 
suitable on grounds of their coverage and currency. 
These sources were Current Contents on Diskette 
from ISI and EMBASE from Excerpta Medica. A 
survey of currency comparing the time-of-journal- 
receipt (TJR) of 100 journal issues at one of our sites 
to that of items from these journals appearing in these 
products confirmed a very impressive currency for 
Current Contents material: it was, on average, precisely 
concurrent with the measured TJR. EMBASE material 
was several weeks less current in our survey, but 
benefits from substantive indexing and (searchable) 
abstracts in many cases: this provides a mechanism 
for enhancing search precision and recall. EMBASE 
has been particularly useful in areas where the lesser 
precision of the Current Contents on Diskette search 
profiles has been a problem. Our findings on the 
excellent relative currency of the ISI product confirmed 
other findings comparing the related SCISEARCH 
database with EMBASE (Stefaniak, 1990). 
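
The currency comparison described above amounts to averaging, over the surveyed issues, the lag between the time of journal receipt and the date on which the corresponding items appeared in a product. A minimal sketch of that arithmetic follows; the function name mean_lag_days and the layout of the observations are assumptions, and no survey data from the paper is reproduced.

```python
from datetime import date
from statistics import mean


def mean_lag_days(observations):
    """Average lag in days between journal receipt and appearance in a product.

    Each observation is a (received, appeared) pair of dates. A negative lag
    means the product carried the item before the issue reached the library.
    """
    lags = [(appeared - received).days for received, appeared in observations]
    if not lags:
        raise ValueError("at least one observation is required")
    return mean(lags)
```

A mean lag of zero corresponds to the "precisely concurrent" result reported for Current Contents, while a product running several weeks behind would show a mean lag of 14 days or more.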

Following the agreement of the producers to 
permit use of these data sources, the SCICAP group 
turned to the technical problem of providing a good 
means of data input to the BULLETIN database for 
data from 3 very different data sources: manual input 
from original journals; data from the diskette system, 
Current Contents on Diskette; and data from the online 
database, EMBASE, available via vendors such as 
Dialog and Datastar. 


3. Use of Pro-Cite 

3.1 Selection of Pro-Cite 

We faced 2 problems regarding the input of data into 
the BULLETIN database. The first, and more 
significant, problem was the need to accommodate 
data derived from the 3 different sources. We did not 
want to have 3 very different procedures for inputting 
data, reasoning that this would lead to unwanted 
complications — and consequent inefficiencies — in 
the overall system design. Given the need for speed 
and efficiency to ensure the overall currency of the 
data, it was clear that we required a straightforward 
production process without avoidable procedural 
variations. 

The second problem related to the geographical 
dispersion of the staff involved in selecting and 
inputting the data, and to the relative inexperience of 
many of them with automated methods of working. 
This demanded a robust and uncomplicated system for 
data input. In fact the majority of the staff carrying out 
manual data input were to be homeworkers, and data 
was to be channelled into the main BULLETIN 
database from 9 sites in the UK, US and France. It 
would not be possible to provide intensive technical 
support for the system at all these locations. 

These considerations militated against the 
otherwise obvious solution of inputting data directly 
into BULLETIN via standard data input screens for 
manual input, and via separate — and different — batch 
reformatting procedures for the electronic data from 
Current Contents on Diskette and EMBASE. Standard 
BASIS K input screens, although effective, are not 
noted for user-friendly features. Further, direct input 
in this way would lead to database validation messages 
that could be incomprehensible to homeworkers and 
other remote staff: the handling of validation problems 
would be best left to a small number of central 
technical staff who were well-versed in the system. 
Reformatting the electronic data would require 
tailoring 2 different, specific solutions — and these 
might well have to change over time as the data 
producers and/or vendors introduced changes in the 
format of the data. 

These factors led us to the consideration of personal 
bibliographic software such as Pro-Cite. At first 
thought it seems paradoxical to use one kind of 
database system — eg. Pro-Cite — to feed another — ie. 
BULLETIN, but in practice the 2 systems 
complement each other very well. Whereas the BASIS 
database, BULLETIN, provides a central platform 
with powerful REPORT features for the automatic 
assembly of bulletins, the personal database software 
offers an off-the-shelf solution in the following 
problem areas: 

• This kind of software is strong in the area of 
reformatting data from a wide variety of 
electronic sources. For example, the Biblio-Links 
companion to Pro-Cite is able to 
interpret online searches from a wide variety 
of databases such as EMBASE and MEDLINE 
on vendors like Dialog. Extensive customization 
on the user's part is not required. The software is 
supported so that changes to the vendor's format 
can be dealt with if they occur. 

• Rich output format is also a feature emphasized 
in this kind of package. This enables, with 
some moderate customization, the output of 
records in the field-labelled input format 
required for the BASIS database BULLETIN. 

• The early production steps could be run 
autonomously on stand-alone personal 
computers. This minimizes communications 
problems between the central database and 
remote input sites. The data could be dispatched 
on diskettes from homeworkers or sent via 
standard electronic mail from other sites prior 
to batch input. Central staff would be on hand 
to resolve any validation problems that might 
arise on input into BULLETIN. 

We evaluated several possible solutions before 
deciding to use Pro-Cite. We chose Pro-Cite because 
we considered it to be a mature and robust package 
offering the full functionality that we required. In 
particular, it offered clear, helpful menus and screens, 
assisted by good use of list boxes, and a straightforward 
system for inputting or editing records, regardless of 
their source. In contrast, some systems offered good 
means for initial manual input, but subsequent editing 
of these records (or of downloaded data from electronic 
sources) was not possible using the same screen and 
methodology. The use of Pro-Cite for reformatting 
data from Current Contents on Diskette and from a 
variety of CD-ROM databases to provide the basis of 
a current awareness system has been described by 
Hanson (1990). This application differs from that of 
SmithKline Beecham in that Pro-Cite is used as a 
final repository for the data as well. 

The overall input scheme depicting the relationship 
between the autonomous Pro-Cite systems and the 
central BULLETIN database is shown in Figure 1. 


Figure 1: Interrelationships between Pro-Cite and the 
BULLETIN database 
3.2 Tailoring Pro-Cite to our needs 

The core database functionality of Pro-Cite hinges 
upon the template or ‘workform’ structure for the 
records. The workform is a particular selection of 
record fields from a standard and invariant list of 45 
master fields. These master fields can be renamed if 
required and most of them can be employed fairly 
flexibly, but some do hold certain implicit 
characteristics depending on their type (eg. author, 
date, etc.). The field order is fixed. For general personal 
bibliographic management, Pro-Cite can switch 
between workforms so as to use the one best suited for 
the record type in question. However, it is possible for 
the user to define a workform, and we took advantage 
of this feature to enable us to design the simplest 
possible template for data input and editing purposes. 
The user-defined workform is illustrated in Figure 2. 
Notice the Chapter field (Chap): the value of data in 
this field is used by BULLETIN to determine which 
Science Alert product and chapter the item has been 
selected for. The Dump field is used for holding 
source data which is of value for screening purposes 
during the selection of Alert material in Pro-Cite Edit 
Mode but is not required to be exported into the 
BULLETIN database. 


Figure 2: User-defined Field Template in Pro-Cite 
In order to carry out manual data input, a Pro-Cite 
database is opened from the introductory menu and 
the default option ‘Edit, Insert or Delete Records’ is 
selected. Once the database has been entered, a 
context-sensitive Option Bar at the base of the screen is used 
to select actions such as ‘Insert’. This command 
displays the workform fields ready for input and 
thereafter offers input assistance via some limited 
word-processing functionality. A useful input aid 
relevant to this application is the provision of standard 
lists of journal titles which can be browsed and selected 
directly into the journal title field as required. At the 
end of each working day the records in the database 
are transferred via use of the ‘Print’ command to a file 
ready for transfer via electronic mail over the R&D 
computer network to the BULLETIN database. The 
output format is determined by a ‘Customization 
File’ that is selected from an Option Box activated as 
part of the Print procedure. The appropriate 
Customization File is tailored to provide output suitable 
for direct batch input into BULLETIN (see Figure 3). 
Additional management information relating to the 
site and source of origin is automatically added to 
each record at this stage. 


Figure 3: Formatted Output from Pro-Cite Ready for 
Input into the BULLETIN Database 


#A @9=ALERT @44=FR-TY @2=Pan JM, 
Hong JY, Yang CS @3=Post-Transcriptional 
Regulation of Mouse Renal Cytochrome-P450 2E1 
by Testosterone @5=1992 @40=299 @41=1 
@42=110-115 @8=h2 @20=steroid, mRNA, 
Tfm mice @10=The sex regulation of this form 
of cytochrome P450 is shown to occur 
predominantly at the post-transcriptional level. 


The output format has been tailored to the input 
requirements of the BULLETIN database via the 
BASIS Free.Form.A batch input procedure. 

#A is the Start-New-Record key. @ introduces a 
field number and its value. Note that management 
information for fields 9 and 44 is generated 
automatically for each record at the Pro-Cite 
output stage, and relates to the product destination, 
site of origin and source of data. 
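
To make the field-labelled layout concrete, the following Python sketch reads text in this style back into records keyed by field number. It is an illustration under stated assumptions and is not the BASIS batch loader: the function name parse_batch is hypothetical, the sample is modelled on the Figure 3 entry with its title shortened, and the field numbers 5, 40, 41 and 42 for year, volume, issue and pages are a reading of that figure.

```python
import re

# A field is "@<number>=" followed by its value, which runs until the next
# field label or the end of the record.
FIELD_RE = re.compile(r"@(\d+)=(.*?)(?= @\d+=|$)")

SAMPLE = """#A @9=ALERT @44=FR-TY @2=Pan JM,
Hong JY, Yang CS @3=Post-Transcriptional
Regulation by Testosterone @5=1992 @40=299 @41=1
@42=110-115 @8=h2 @20=steroid, mRNA,
Tfm mice"""


def parse_batch(text):
    """Split batch text on the Start-New-Record key and map field number to value."""
    records = []
    for chunk in text.split("#A"):
        # Line breaks inside a record carry no meaning, so collapse whitespace.
        chunk = " ".join(chunk.split())
        if not chunk:
            continue
        records.append({int(num): value for num, value in FIELD_RE.findall(chunk)})
    return records
```

Calling parse_batch(SAMPLE) yields one record in which field 2 holds the author list and field 42 the page range. The sketch assumes that neither "#A" nor an "@number=" sequence occurs inside a field value; a production loader would need an escaping rule for such cases.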





Data from EMBASE can be captured into a similar 
Pro-Cite database by use of the Biblio-Links utility. 
Biblio-Links can, with minimal customization, 
interpret the complex electronic output from an online 
search to identify records and record fields from a 
variety of databases. We decided to use the Dialog 
version of Biblio-Links after unsuccessfully attempting 
to modify the BRS version to our needs using data 
from the online vendor, Datastar (whose retrieval 
language resembles BRS output). Once the EMBASE 
records have been downloaded into Pro-Cite, they 
may be edited or deleted in a straightforward and 
facile manner using the same workform as for manual 
input. 

Data derived from Current Contents on Diskette 
may be loaded into Pro-Cite in one of 2 ways. The 
most obvious way is to make use of the Pro-Cite 
output format on Current Contents on Diskette which 
has been specifically provided by ISI for this reason. 
This is then transferred directly into Pro-Cite by use 
of the ‘Import’ command. This works very well for 
most purposes but forces the adoption of a workform 
other than the user-defined one required for this 
application. It is possible to programme ways around 
this problem but we did not find a solution which was 
reliable enough for our needs. For this reason we 
turned to the second mode of transfer, which is to use 
the Dialog-Medline output format provided within 
Current Contents on Diskette and then to use Biblio-Links 
in the same fashion as for the EMBASE material 
above. This provides a very robust data transfer system. 


4. Discussion 

The Science Alert system currently provides around 
5000 weekly bulletins to around 1500 subscribers, 
involving the capture and processing of 2000-3000 items 
each week. The final format of items within the 
bulletins is depicted in Figure 4. User surveys have 
established that the system is successful in meeting 
the expectations of most subscribers. 


Figure 4: Format of Entries in Science Alert 


Post-Transcriptional Regulation of Mouse 
Renal Cytochrome-P450 by Testosterone 

Pan JM, Hong JY, Yang CS 

Archives of Biochemistry and Biophysics (1992) 299 #1, 
110-115 

The sex regulation of this form of cytochrome P450 
is shown to occur predominantly at the 
post-transcriptional level. steroid, mRNA, Tfm mice 





The system has been operating efficiently and 
without significant problems for over 18 months: 
both the BASIS K elements and the Pro-Cite elements 
have proven to be robust and effective. Overall, the 
increased automation of the production process and, 
in particular, the user-friendliness of the Current 
Contents on Diskette and Pro-Cite systems, have 
resulted in significant savings of human time and 
effort. A benchmarking exercise carried out to compare 
resources involved in current awareness provision in 
12 pharmaceutical concerns established that these 
savings, coupled with the economies of scale achieved 
from working in a multi-site, international manner, 
provide SmithKline Beecham with a high-quality 
current awareness service from a comparatively modest 
investment of human resources. We are now 
progressing with plans to further upgrade our 
scientific current awareness system: this will involve 
provision of a highly-automated, personalizable 
system. The role of information intermediaries will 
be to facilitate user access to the system, and not to 
select material as at present. 
Creating hypertext documents: is it worth the effort? 


Duncan Langford and Peter Brown 
University of Kent at Canterbury 


Abstract 


Producing an effective hyperdocument requires considerable work and a surprising diversity of skills. In addition to 
being good writers, the authors should have abilities in design, user-interfaces, testing, structuring and maintenance. 
This paper will analyse the work that goes into producing production-quality hyperdocuments and what the cost/ 
benefits are. It is assumed that the reader, though not necessarily a hypertext expert, knows what hypertext is. 


Introduction 

As we are all aware, information is increasingly being 
stored and accessed in electronic form. Although there 
are many advantages (facilitation of storage, updating 
and duplication, for example), ease of access is normally 
considered a principal benefit. However, if data volume 
is large, or the density of information is high, some 
specialized help is likely to become necessary for both 
reader and producer of electronically stored 
information. Hypertext may be the means to provide 
this help. 

Hypertext — linking of information nodes in a non- 
linear way — has been successfully used as an interface 
between a user and an information system in a variety 
of public service settings. Probably the application 
best known to the general public is the pictorial index 
to the National Gallery, and, to information 
professionals, the new OPAC (Online Public Access 
Catalogue) of the British Library. 

There are many possible areas within the field of 
professional information management where a 
hypertext approach might prove advantageous. Such a 
system can not only reduce the load on scarce 
trained staff, but actually increase effective usage of 
material. Facilitating the access of users to information 
is of potential benefit to everyone. 

This paper is not intended to establish the 
importance of hypertext within information systems; 
the case is largely made. The next step is the practical 
consideration of a hypertext system. Drawing on both 
academic research and the experiences of ten years of 
commercial hypertext development, we address this 
issue directly. 

Our conclusions indicate that design, construction 
and maintenance of a hypertext document is far from a 
trivial task, and that creating a well designed and 
effective hyperdocument is consequently likely to 
involve considerable overhead. 


Task identification 

Rather than describing a generic theoretical model, as 
an illustration let us consider the likely needs of the 
manager of a major information resource, such as a 
library. In essence, such a manager is in the position of 
developing and maintaining an effective interface 
between potential users and the information for which 
he is responsible. Although in most libraries it is 
virtually certain that some form of electronic indexing 
will be in use, there are still obvious benefits in further 
facilitating the interaction between user and material. 
Looking to the future, data which is entirely held in 
digital form is a growing field — the new Library of 
France anticipates holding some 300,000 such 
documents when it opens in 1995 (Holderness¹). 

A hyperdocument, such as a hypertext ‘front end’ to 
an electronic catalogue, would be a natural development, 
while a hypertext-based interface to electronic 
media offers even greater advantages. Let us assume 
that a library is preparing a hyperdocument that will 
guide users on where to find material in the library. 

When considering the use of hypertext, there first 
needs to be a detailed preliminary investigation, leading 
to a clear understanding of exactly what the scheme is 
intended to achieve. What other choices are there, and 
is hypertext the right solution? It may be that this 
preparatory examination shows introduction of 
hypertext is not appropriate: a discovery best made as 
early as possible. 

Once it has been decided to establish a hypertext 
system, it is deceptively easy to immediately begin 
implementation. This is typically done by drawing on 
existing information held electronically, and 
manufacturing what appear to be appropriate links 
between sections. An understandable approach, which 
is inexpensive, easy — and wrong. Although initially 
much easier and cheaper to create, a system built of 
unplanned connections is inevitably confusing to 
hypertext user and author alike. 

It is essential that, as a first stage, a clear definition 
of the task is formulated. It is important that such a 
definition is well founded, and there are strong grounds 
for the involvement even at this stage of potential system 
users. Indeed, although preliminary research should 
be carried out by an authorship team with an effective 
knowledge of hypertext, able to accurately analyse the 
proposed application and compare it with others, it is 
probable that user input would add further useful 
information. 
At this point in the development of a hypertext 
application there is much in common with the 
introduction of new computer systems where work is 
preceded by systems analysis. It is also true that unless 
the purpose of any system is clearly defined, an effective 
outcome is unlikely. 

It is important that the intended objective is both 
practical and achievable with available resources. 
Constraints of budget are central — hypertext can do 
much, but it is not possible to implement a workstation 
system on a PC budget. In the present example the size 
of the catalogue, and its complexity, must greatly 
influence system design decisions, determining 
consequent essential costs in time and equipment. 

The overall size of the budget should be realistic; 
and development costs may be high. As will be seen 
later, though, it is crucial that not all resources are spent on 
development. 

Even before a line of the prospective hypertext 
application is written, there are unavoidable costs 
associated with feasibility studies and accurate system 
planning. 


Writing it 
Contrary to general belief, constructing effective 
hypertext is not simple. This is partly because one of 
the greatest strengths of hypertext — adaptability to a 
range of different uses, in a wide variety of settings — 
has tended to work against the development of simple 
general rules. However, a few general guidelines have 
emerged from our research and experience in 
developing commercial systems. They are by no means 
exclusive, and are not intended as an exhaustive 
selection: 

• Are the proposed links appropriate? 

• Are too many links nested? 

• Are links accurate? 

• Has sufficient attention been given to the user 
interface? 

• Is the system usable? 

• What are user preferences? 

We shall consider each of these in more detail. 


Are the proposed links appropriate? 

Linking is intrinsic to hypertext. The interconnected 
nature of information within a hypertext places a 
heavy responsibility on an author, first to supply 
appropriate linking, and then to provide a consistent 
pattern of links. What are appropriate links will vary 
according to both the data set and the pattern of anticipated 
use. The temptation for inexperienced authors is 
to provide either a ‘dense’ hypertext, where virtually 
everything which could be linked has been given a 
connection, or ‘focused’ text, where only a narrow 
author-defined path is available through the data. 

An extreme example of dense hypertext might be 
where every word in a passage is linked to a full 
dictionary definition — potentially, thousands of links. 
Dense hypertext is very expensive in author time, and, 
unless verified by user evaluation and testing, may 
actually be counterproductive in user effectiveness.
Recent tests at UKC (University of Kent) of a hypertext
X Windows tutorial (written in UNIX Guide) revealed
that some users were intimidated by the perceived
complexity of even a carefully designed ‘dense’ system
(Langford).

The nature and size of the current dataset must be 
considered when linking decisions are made. What 
may actually be an ideal link density for the contained 
material may in practice have to be modified if, for 
example, links have to be made by hand in a large 
volume of data. When assessing size, use the volume of
data that the system is eventually intended to hold,
rather than a potentially much smaller starting
measure.

Consistency in establishing links is essential in 
building user confidence. Rapid confusion will result 
if link density in similar material varies according to 
the time it was added. 


Are too many links nested? 

A ‘nested’ link is one that is contained within another. 
In other words, depending on the application design, to
reach a particular location in the hypertext it may be 
necessary for a user to have first selected and displayed 
a series of previous links. An example might be where 
a reader wishing to discover the age of an author may 
first have to move through several links concerned 
with biographical details. Such multiple, nested links 
are in practice frequently used, particularly when 
converting a conventional 'flat' data structure with 
many subheadings to hypertext format. However, 
research, including the recent UKC tests, has shown 
that a substantial percentage of users often prefer a 
shallow text. A deeply nested hypertext is not 
necessarily always the best choice. 
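
The depth of nesting can be measured before users ever see the system. The following is a minimal sketch in Python, assuming the hypertext can be exported as a table that maps each node to the nodes it links to; the node names and the `link_depths` function are hypothetical and do not describe any particular hypertext system. It reports the fewest link selections needed to reach each location from a starting node.

```python
from collections import deque

def link_depths(links, start):
    """Return the fewest link selections needed to reach each node from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in depths:  # the first visit is the shortest route
                depths[target] = depths[node] + 1
                queue.append(target)
    return depths

# Hypothetical fragment: the author's age sits behind several biographical links.
links = {
    "author": ["biography"],
    "biography": ["early life", "career"],
    "early life": ["date of birth"],
    "date of birth": ["age"],
}
depths = link_depths(links, "author")
deepest = max(depths, key=depths.get)
```

A report of this kind only shows where the deepest material lies; whether that depth suits the users is still a matter for testing with them.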


Are links accurate? 

This point may be an obvious one, but its importance 
cannot be overemphasized. However well designed 
they may be, if links are not accurate then they are of 
no value. It is therefore essential that the pattern of 
linking is independently tested, ideally by authorities 
on the relevant subject. Testers should not have been 
previously concerned with the linking process. Such 
testing can be expensive, in terms of both direct costs
(paying an expert) and the inevitable delay to the
final system.


Has sufficient attention been given to the user interface?

Is the proposed user interface simple, and cognitively
appropriate for typical users? Research suggests that 
navigation of hypertext is easier for users if they are 
able to either build or accept a workable cognitive map 
of the structure and arrangement of contained 
information (Streitz et al 3). If the author decides to
avoid actual provision of a diagram showing 
arrangement of the data, it is even more important that 
presentation of data and associated linking are 
reasonable, and that cognitive demands on the user are 
not excessive. The hypertext should be designed as a
coherent whole; paths through the contained 
information should appear logical without specialist 
knowledge. The way in which users are expected to 
interact with the completed system is an important part 
of the hypertext. 


Is the system usable? 

Throughout hypertext development, the emphasis 
should be on function and simplicity. It should always 
be borne in mind that the purpose of the design is to 
provide a workable system for actual users, who do not 
tend as a class to appreciate unnecessary complexity. 
When developing any computer system it is easy to 
lose sight of how it may appear to those less familiar 
with the work. Maintaining an emphasis on usability is 
important, but confirming that the system is going to 
be usable by its clientele should not wait too long. We 
suggest usability of the final system will improve if 
there is involvement of prospective users throughout 
the development process. 


What are user preferences? 

Even though some prospective users may have been 
involved earlier, it is still essential that a hypertext 
system is fully tested before being officially released. 
It is not until the full system is available that patterns of
use emerge, and the preferences of 'real' users become
clearer. Testing should allow for rough edges in the
system to be smoothed, and practical identification of
potential problem areas. It may also reveal unexpected
information about actual use of the system, which will
often be different from what the authorship team 
anticipated. For example, the UKC tests revealed a 
surprising number of subjects preferred to examine a 
small section of hypertext in detail, largely ignoring 
the anticipated route through the data. 


Looking after it 

It is well known that, for any computer software that 
sees widespread successful use, the maintenance cost 
far exceeds the original development cost. The term
'maintenance' covers all changes that need to be made
to the original product. These changes result from 
bugs, extensions, reaction from users, changing of 
data, changing of the hardware/software platform, etc. 
Typically maintenance consists of a large number of 
apparently small tasks, each of which takes a 
surprisingly large time. 

Maintenance is as much an issue with hyperdocuments
as with any other software products. Hence
any exercise to evaluate the possible cost/benefits of a
hyperdocument must pay special attention to the
maintenance stage. Furthermore the design and
implementation of the hyperdocument must concentrate
on making a product that is easy and cheap to maintain, 
and not dependent on computer gurus to keep it going. 

Because this is such an important matter we will 
show two fairly detailed examples, in order to throw 
light on the problems that arise. 


Example 1: A change in the outside world

Assume that when a user selects a button in the library 
hyperdocument a program is run behind the scenes. 
The program might, for example, consult the library 
catalogue. Three years after the original development
of the hyperdocument, this program suddenly stops
working. The task of the maintainer is to find out why.
Perhaps there has been a small 'improvement' in the
operating system, perhaps the computer's filing system 
has been reorganized, perhaps something internal has 
overflowed. The maintainer tracks down the problem 
and finds it was due to an improvement in the operating 
system: an operation used by the original author 
is ‘non-standard’ and no longer works; it now needs to 
be done in a different way. The maintainer changes the 
piece of program that caused the problem, and 
also identifies any other programs in the hyper- 
document that require a similar change. Unfortunately, 
though the changes initially seem successful, it turns 
out that they have caused other parts of the 
hyperdocument to go wrong. Further changes are 
therefore needed. 

What was the author's reason for using the non- 
standard feature, and why do other parts of the 
hyperdocument depend upon it? Unhappily, the original 
author has now left the library, and, though he left 
some documentation of his work, the documentation is 
hard to follow, and certainly does not cover every 
design decision. 


Example 2: A change in structure 

The first example shows problems that apply to 
hyperdocuments because they are computer software. 
This second example shows problems specific to 
hyperdocuments. 

Assume that, in its initial design, the library 
hyperdocument has a structure in which all the 
information is segmented into topics and sub-topics. 
When the system is used, this structure causes problems 
to inexperienced users, who often do not know which 
sub-topic their required information is classified under. 
To remedy this problem the structure needs to be 
changed. Changes in structure are, indeed, a common 
need, because authors will never get the structure right 
first time, even though the worst problems may be 
eliminated by the initial user testing. 

In almost any hypertext system it is easy to change 
the content, e.g. to change words and/or pictures. 
Changing the structure is, however, a fundamentally 
complex problem, even if the hypertext system that 
you are using provides assistance for such a task. 
There is bound to be a lot of moving about of 
information and changing of links, and it is realistic to 
expect to take a month to make a significant structural 
change in a large hyperdocument. If we are unlucky 
and our now-departed author had built a house- 
of-cards structure where each piece is highly 
dependent on its neighbours, then changing the 
structure may necessitate a complete rebuild, taking 
many months. 


Making maintenance easier

There are no easy answers to reducing the burdens of 
maintenance, but nevertheless care in selecting
hardware/software, in selecting staff and in defining 
working practices can all yield good rewards. Some 
particularly important points in aiding cost- 
effectiveness are outlined below: 


• try to avoid hypertext systems that have esoteric
internal formats. Hypertext standards are, 
unfortunately, a long way off, although the 
HyTime standard has been approved. In the 
meantime, aim for a hypertext system that 
supports an open file format that can be read by 
lots of different tools. 


• encourage staff to use a straightforward
unexciting approach, rather than a clever 
approach that sails close to the wind. The wind 
will change. 


• a killing maintenance problem is caused by
authors who fill their hyperdocuments with a 
huge number of links that give the appearance 
of having been chosen according to the author's 
current whim. (As we have said, this can be a 
killing problem for readers too.) To combat this, 
specify a house-style for your hyperdocument. 
This house-style will define the strategy for 
what is to be inter-linked, and will specify a 
uniform style of presentation for the user to see. 
Above all the house-style should be a discipline 
for all staff on the project. 


• build a test-suite. This is a systematic and (semi-)
automatic testing procedure that can be applied
before each new release of the hyperdocument. 
Ideally your hypertext system should help: it 
should, for example, be able to test that all links 
are properly connected and that all ‘programs’ 
embedded within the hyperdocument are at least
syntactically correct. This mechanical testing 
complements the testing by humans that we 
have already mentioned. A mechanical test can 
test if a link is there, but only a human can 
decide if it is wise. 


• require authors to document what they are doing.
Although a house-style imposes a discipline it 
never will and never should deny the authors all 
freedom of choice. When the author makes a 
choice he should explain it, for the benefit of 
subsequent maintainers. Ideally the hypertext 
system should allow authors to insert comments 
in their hyperdocuments ('This link is present
because...'). Such commentary is, of course,
invisible to ordinary users. At the University of 
Kent, when we set hypertext authorship 
assignments, we warn students that they will 
lose marks if they do not insert commentary 
explaining what they are doing. The result has 
been a considerable improvement in the 
understandability and hence the maintainability 
of hyperdocuments. 
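
As an illustration of the mechanical part of a test-suite, the following is a minimal sketch in Python, assuming the hyperdocument can be exported as a table of nodes, each with its text and its outgoing links. The node names and the `find_broken_links` function are hypothetical; a real hypertext system would ideally provide such a check itself.

```python
def find_broken_links(nodes):
    """List (source, target) pairs whose target node does not exist."""
    broken = []
    for source, node in nodes.items():
        for target in node["links"]:
            if target not in nodes:
                broken.append((source, target))
    return sorted(broken)

# Hypothetical hyperdocument: node name -> text and outgoing links.
nodes = {
    "home": {"text": "Welcome to the library", "links": ["catalogue", "opening hours"]},
    "catalogue": {"text": "How to search the catalogue", "links": ["home", "loans"]},
    "opening hours": {"text": "Monday to Friday", "links": ["home"]},
}
broken = find_broken_links(nodes)  # the 'loans' node was never written
```

Run before each release, a check like this catches links that lead nowhere. It says nothing about whether a link that does connect is a sensible one, which remains a job for human testers.
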


Automatic change

A way of taming maintenance costs is to automate 
change. This can only be done in restricted 
circumstances, but we will give two examples to show 
the potential. 


Example 1: Sharing information

If two separate documents contain the same information
this is certain to lead to maintenance problems. Assume,
for example, that one document, document X (not
necessarily a hyperdocument), defines the classification
system in which the book stock of the library is 
organized. In addition, a library hyperdocument uses 
this same classification to help explain to users how to 
find books. Assuming that the classification changes a 
bit over time, it is certain that eventually the two 
documents will get out-of-phase: someone will update 
one of them to reflect a change and will forget to 
update the other. 

The best way of avoiding such problems is to 
absorb the two documents within a unified docu- 
mentation system, but this is more easily achieved in 
theory than in practice. A more realistic solution is to 
designate one document — document X, say — as the 
master, and to make the hyperdocument an active 
document (Amon et al.) in the sense that the
classification system is not wired in, but is generated 
on-the-fly from document X whenever it is needed. As 
a result, any alteration in document X will be 
automatically reflected in the hyperdocument. 
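
The idea of an active document can be sketched as follows. This is a minimal illustration in Python, assuming document X is held as plain text with one 'code, tab, subject' entry per line; that format, the class codes and both function names are hypothetical. The page shown to the user is built from the master each time it is requested, so nothing is copied into the hyperdocument.

```python
import io

def read_classification(master):
    """Parse 'code<TAB>subject' lines from the master document."""
    scheme = {}
    for line in master:
        line = line.strip()
        if line:
            code, subject = line.split("\t", 1)
            scheme[code] = subject
    return scheme

def render_classification_page(master):
    """Build the hyperdocument page from the master each time it is requested."""
    scheme = read_classification(master)
    lines = ["Where to find books"]
    for code in sorted(scheme):
        lines.append(f"{code}  {scheme[code]}")
    return "\n".join(lines)

# Stand-in for document X; in practice this would be a file opened on demand.
document_x = "004\tComputing\n020\tLibrary science\n"
page = render_classification_page(io.StringIO(document_x))

# After the master changes, the next request reflects it with no edit to the page.
document_x += "540\tChemistry\n"
updated_page = render_classification_page(io.StringIO(document_x))
```

The design choice is that only document X is ever edited by hand; the cost is that the page depends on the master being readable whenever a user asks for it.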


Example 2: Automated checking 

We have emphasized that it is vital to have a test-suite 
to validate a hyperdocument. Our second example 
extends the principle of the first example in order to 
cover the test-suite. 

Assume that the library hyperdocument contains a 
number of instances of titles of individual books in the 
library — they may, for example, be used within real 
examples of procedures for finding a book. Assume 
further that the mark-up in which the hyperdocument 
is represented captures the fact that a particular string 
of words represents a book title. This second assumption 
is a crucial one: it is now widely accepted that the key 
to capturing information is to capture its logical nature, 
not just its appearance (e.g. that it is represented in a 
particular font); mark-up languages such as SGML 
allow logical structure to be captured. 

Given these two assumptions it should be a trivial 
job to write a program (part of the test-suite) to extract 
all the book titles contained in the hyperdocument and 
check them against the online library catalogue. (Such 
a program might, for example, be represented in under 
ten lines as a Unix shell script.) This checking program 
would reveal all book titles that were wrongly cited, 
for example because they were misspelled or were no 
longer in the library catalogue. 
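
A checking program of this kind might look like the following minimal sketch, given here in Python rather than as a shell script. It assumes that titles are marked with a `<booktitle>` element and that the catalogue can be supplied as a list of titles; the element name, the sample titles and both function names are hypothetical stand-ins, not features of any particular mark-up scheme or catalogue.

```python
import re

def cited_titles(markup):
    """Extract the text of every <booktitle> element (hypothetical tag name)."""
    found = re.findall(r"<booktitle>(.*?)</booktitle>", markup, re.S)
    return [" ".join(title.split()) for title in found]  # normalise white space

def wrongly_cited(markup, catalogue_titles):
    """Return cited titles that do not appear in the catalogue."""
    known = {title.lower() for title in catalogue_titles}
    return [title for title in cited_titles(markup) if title.lower() not in known]

# Stand-ins for the hyperdocument source and the online catalogue.
markup = """
<para>To find <booktitle>A Guide to the Library</booktitle>, search by title.</para>
<para>The same applies to <booktitle>Finding Boks Quickly</booktitle>.</para>
"""
catalogue = ["A Guide to the Library", "Finding Books Quickly"]
problems = wrongly_cited(markup, catalogue)
```

The check is only possible because the mark-up records that a string is a book title; if titles were merely set in italics, the program would have nothing reliable to extract.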

Checks such as this cost a bit of time and effort, but 
considerably reduce the cost of maintaining an accurate 
hyperdocument over time. The guiding principle is 
that if a hyperdocument is of good quality it will last
longer, and will therefore be more cost-effective in 
spite of the extra effort spent on ensuring its quality. 


A costed project 

The potential benefits of a hyperdocument are to 
provide better information quicker. Although the costs 
of developing a hyperdocument are easy to measure, 
the benefits are not. (The development cost of the 
National Gallery system was, incidentally, about 
£600,000, though it is more than just a hyperdocument; 
a significant part of the cost was the capture of high- 
quality images of paintings.) There is, however, one 
published instance where the benefits of a hyper- 
document have been quantitatively evaluated. This is 
the LOCATOR system, which is used by ICL to help 
diagnose fault reports. A hyperdocument, in this case
based on the Unix Guide hypertext system (Brown),
is used by an engineer to help diagnose a fault being 
described by a customer at the end of a telephone. In 
essence, the engineer asks the customer questions 
dictated by the hyperdocument and selects a button in 
the hyperdocument according to the customer's answer. 
The button leads to a further question (often lying in a
completely different part of the hyperdocument) until 
a final diagnosis is reached. Before the introduction of 
hypertext, the information was stored on volumes of 
paper, and the engineer had a thankless task jumping 
from one volume to another. 

The benefit of LOCATOR was that the diagnosis 
rate improved from 66% to 85%. As a result field 
engineers are used much more efficiently: each wrong 
diagnosis can result in a field engineer being sent on a 
fruitless visit to the customer, or being sent out with 
the wrong spare parts, both representing a large wasted 
cost to ICL. 
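
To see what the improvement means in terms of wasted visits, the arithmetic can be worked through as below. This is a small sketch in Python, assuming the quoted rates are rates of correct diagnosis; the figure of 1,000 fault reports is purely hypothetical and is used only to turn percentages into counts.

```python
calls = 1000  # hypothetical number of fault reports

correct_before, correct_after = 66, 85  # per cent, as quoted in the text

# Each wrong diagnosis is a potential fruitless visit or wrong set of spares.
wrong_before = calls * (100 - correct_before) // 100
wrong_after = calls * (100 - correct_after) // 100
avoided = wrong_before - wrong_after
reduction = avoided / wrong_before  # share of wrong diagnoses eliminated
```

On these assumptions the number of wrong diagnoses falls by more than half, which is a more telling measure of the benefit than the 19-point rise in the diagnosis rate.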

The LOCATOR work is described in papers by 
Rouse and Lewis. The latter paper, incidentally,
describes the considerable investment that ICL made 
to create a tool to enforce a house-style. The benefits of 
this in reduced maintenance costs nevertheless justified 
the large initial investment. 

The actual costs of the LOCATOR project are 
company confidential, but the bottom line is that the 
project had to justify itself on a cost/benefit analysis, 
and the improved diagnosis rate amply achieved this. 
Clearly the benefits of getting quick and accurate 
information to field engineers are easier to measure 
than the benefits of helping a library user to find what 
they need and to find it quickly. Nevertheless the 
success of the former gives hope to the latter. 


Summary 

The key to cost-effective creation of hypertext 
information is to make sure that the product is good to 
start with and that it remains good over a long time. 
Sadly, skimping on the original cost is a certain way to 
ensure that a product is not cost-effective: even if it is 
good to start with, it will not be maintainable over a 
long period. 

Particular aids to producing a cost-effective product
are:

• clearly identify a realistic goal.

• ensure accurate and relevant linking, in a
document of uniform density.

• involve potential users throughout the development
process, and take notice of their views.

• develop a house-style.

• have a systematic testing procedure to help
prevent further bugs creeping in during the
maintenance period.

• make sure that authors document their strategy.

• make the hyperdocument work naturally
with other documents, and avoid having
information that needs to be updated in more
than one place.


Obviously, even if you try to follow the above 
precepts you will not get everything right. 
Nevertheless we believe that, even with some 
mistakes, there is plenty of potential for creating a 
host of satisfied library customers, pleased that their 
information needs have been met in an effective and 
user-friendly way. Success does not, however, come 
cheaply, and the actual writing of a hyperdocument 
is only a part of the overall cost. A minimum 
development budget is likely to be tens of thousands 
of pounds. 


Abstract

This paper will argue that current attempts at making systems user-friendly for businessmen and the public at large
are doomed to failure because they do not deal with the key issues of document organization, selection, classification,
summarization, and interpretation.


Introduction 

The reason that the online information business has 
failed to attract more end users is that it has been trying 
to market a service deriving from the philosophy, 
culture and needs of the scientific community. 

The ethos of the research scientist is that bias 
distorts research. Classification and interpretation were 
felt to bias the individual scientist’s search for the 
truth. My close colleague and mentor, Dr Tony Kent
(who pioneered the Royal Chemical Society’s first 
online information systems) consistently makes the 
point that information scientists were subject scientists 
— mainly chemists — specialized in searching through 
the literature for items of interest to research teams. 

The interests of the wider business and enter- 
tainment market are quite different. Giving a 
businessman an online system, whether it be full-text 
or abstracts, simply makes a bad information overload 
problem worse. Reading time for business decision 
makers is at a premium which is why they need such 
high value information. 

Evidence will be introduced from Trend Monitor’s 
direct experience indicating that groups of people can 
work together using common classification terms so 
that information processing can be made into systematic 
working which relies on human judgement and 
intelligence, not arbitrary computer algorithms. 

The paper will then conclude that before the online and
disc-based information media can finally become a
commercial success, information must be processed by
people in order to add value to it. It points to early
examples of Information
Refineries as being the harbingers of the new ‘media’
industry that will make multimedia electronic databases 
into one of the top growth industries of the 1990s. 


1. The culture and practice of information science

1.1 The ethos and culture of science 

The very nature of science is about research. The hunt 
for the ‘truth’ is the main purpose. Scientists take a 
great deal of time and effort in order to ensure that their 
discoveries are original. Put coarsely, the ethos of 
science has been about delving deeper and deeper into 
a subject. Successful scientists are ‘hyper-experts’ 
who have a great deal of knowledge about worlds that 
laymen can hardly conceive. Despite the growing
movement towards a more interdisciplinary approach,
scientists are still, quintessentially, not only hyper-
experts, but are also super-specialists.

Scientific research takes two forms:

• experimenting in order to test a theory,

• judging the results of other people’s theorizing
and experimentation.


Since the 1930s, the explosion in the amount of
theorizing and experimentation has been so great that
experimenters and theorizers increasingly could not even
hope to keep up with what their
colleagues were doing. It became necessary for other
scientists — considered not quite so high up in the 
scientific pecking order — to search the literature in 
order to find out what others were doing in the area of 
specific research that was being undertaken by the 
experimental team. 

According to Dr Tony Kent, who devised and 
wrote (in Assembler) one of the UK's very first 
electronic online information systems for the Royal 
Chemical Society, the term ‘scientist’ in the phrase 
‘information scientist’ stands for the scientists on the 
research team whose job it was to search through the 
literature. Tony always insists — and he was there at the 
time — that literature researchers took on the name of 
information scientists, not because they were studying 
a science of information, but because they were
chemical scientists whose job it was to search for 
information pertinent to the work being done by the 
research team of which they were part. It is no accident, 
therefore, that more often than not when you scratch 
somebody who calls himself an information scientist 
you find a chemist underneath. 

As it happens, in the late sixties and early seventies,
chemistry led the sciences into using the technology of
online text retrieval systems, which was invented by
software pioneers at about that time. The disciplined
and internationally accepted way in which
chemists use language to describe chemical compounds
made free text retrieval a very powerful search tool. 
The reason is that text retrieval excels in the recall of 
information so long as the searcher knows exactly 
what he is looking for beforehand. This word by word 
recall became a powerful tool in the hands of the 
literature researchers. I believe that one of the reasons 
for the huge commercial success of the chemical and
pharmaceutical sciences is the literature research 
undertaken by information scientists using online 
information retrieval systems. 

Indeed, so successful were they that Jason Farradane 
founded the Institute of Information Scientists, and 
was appointed to a new Chair of Information Science 
at City University, London. Information Science was 
then reconceived as being a science in its own right, 
rather than a series of search practices useful for 
finding scientific research data. Another result of the 
success of these ‘information science’ techniques in 
the fields of scientific and technical information was 
that it became fashionable to think that they should be 
applied to business information, and the illusion arose that
one day everyone would become an online ‘end-user’.

These initiatives have, I believe, been inimical, 
both to the development of an information management
profession with rules of practice similar to other 
established professions and to the wider use of online 
and disc-based information systems as business 
information tools. 


1.2 The practice of information science: 

In the early days, documentalists and literature 
researchers evolved a series of practices, using trial 
and error, as well as imagination and intelligence in 
order to deal with the need to find material in the 
literature pertinent to the research project on which 
they were working. 

By and large most of these socio-technical (man / 
machine) procedures had already developed long before 
information science, as such, became established to 
claim them as its own. I refer, to name but a few, to
KWIC (key word in context); KWOC (key word out of
context); nested boolean algebra; thesauri containing
synonyms and tables of broader and narrower terms
used for both indexing and free text retrieval; SDI
(Selective Dissemination of Information); and
sophisticated text character substitution routines which
can be used for searching meaningful word fragments.
These practices existed alongside certain rules of thumb,
such as narrowing your search early by using pertinent
proper nouns, chemical names and the like, and being
careful when using the boolean NOT to avoid ‘notting
out’ what you are really looking for.
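
For readers who have not met a KWIC index, the following is a minimal sketch in Python of how one can be produced. The stop-word list, the sample titles and the `kwic_index` function are illustrative assumptions, not a reconstruction of any historical system: every significant word of each title becomes an entry, shown in capitals with the words on either side of it, and the entries are sorted by keyword.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}

def kwic_index(titles, width=30):
    """Return (keyword, line) pairs, one per significant word, sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # minor words do not get entries of their own
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            # Right-align the left context so the keywords line up in a column.
            line = f"{left:>{width}}  {word.upper()}  {right}"
            entries.append((word.lower(), line))
    return sorted(entries)

# Hypothetical titles, chosen only to show the layout.
titles = ["Synthesis of organic compounds", "Retrieval of chemical literature"]
index = kwic_index(titles)
```

Printing the second element of each pair gives the familiar layout, with the keywords aligned down the middle of the page and their context on either side.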

One of the principal early researchers into the 
effectiveness of these techniques, Cyril Cleverdon of 
the Cranfield Business School, wrote to me when I was 
editor of The Intelligent Enterprise about the results of 
his trail-blazing research. 

Writes Cleverdon:

"There are a number of stages in the process
of information retrieval, and at each stage
where intellectual decisions have to be
taken, experimental evidence indicates that
there will be a divergence of views. It has
been found that:

• if two people or groups of people
construct a thesaurus in a given subject
area, only 60 per cent of the index terms
may be common to both thesauri;

• if two experienced indexers index a given
document using a given thesaurus, only
30 per cent of the index terms may be in
common;

• if two intermediaries [i.e. trained
practitioners, JW] search the same
question on the same database on the
same host, only 40 per cent of the output
may be retrieved by both searchers;

• if two scientists or engineers are asked to
judge the relevance of a given set of documents
to a given question, the area of
agreement may not exceed 60 per cent."
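
The quotation does not say how agreement was measured. As one possible reading, the sketch below (in Python) takes 'in common' to mean the terms assigned by both indexers as a proportion of all the terms that either assigned. That definition, the two sets of index terms and the variable names are assumptions made for illustration only.

```python
# Hypothetical index terms assigned to the same document by two indexers.
indexer_a = {"hypertext", "links", "maintenance", "testing", "cost", "libraries", "users"}
indexer_b = {"hypertext", "maintenance", "cost", "software", "documents", "navigation"}

shared = indexer_a & indexer_b         # terms both indexers chose
all_terms = indexer_a | indexer_b      # terms either indexer chose
overlap = len(shared) / len(all_terms)
```

Here the two indexers share three terms out of ten, an overlap of 30 per cent, even though each would regard their own set of terms as a reasonable description of the document.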

For scientists, this level of uncertainty and 
disagreement is acceptable for at least two reasons. 
First, the new international dial-up online technology 
made it possible to search very large portions of both 
the abstracted, and increasingly the full-text, literature 
all at once within minutes. The speed and efficiency 
with which the literature could be searched for specific 
needles in haystacks was increased vastly at a stroke. 
The second reason is that it is the job of scientists to do 
as much research as possible in order to back test their 
own findings, and evaluate those of others. The role of 
research scientists is to find for themselves something 
new and original. It is not in their interest to agree 
about search terms, and results of searches. Quite 
simply, if everybody found exactly the same thing, the
literature searchers would be out of a job. Besides,
human expression in language is so complex, and
people are so individual (thank God) that Cyril 
Cleverdon's figures are actually quite impressive. 

New techniques and practices have been developed 
since then which purport to use ‘expert systems’ and
computer controlled relevance ‘weighting’ and
‘relevance ranking’ algorithms, both to reduce the
need for the skilled intermediary online searcher and
to increase the ‘hit rate’ of searches, i.e. pertinence to
the end user. Most of the so-called expert systems
which make it possible to use ‘natural language’ to
search large text database systems are based on thesauri
which are at best subject to the human ‘Cleverdon
uncertainty factor’ cited above. These systems, of
which Tome Searcher is exemplary, can be useful so
long as users understand what they are actually doing,
which is automatically classifying nouns in sentences
into a hierarchical table of terms from various thesauri.

The automatic ‘weighting’ and ‘relevance ranking’ 
routines are to my mind a very poor substitute for 
real online searching by intelligent intermediaries. 
Last June Tony Kent wrote to me at The Intelligent 
Enterprise: 

‘I will no doubt be accused of elitism if I
express my long held view that the
processes of information management and
retrieval can never be simplified to a point
where they may be conducted by half-wits
(which is why, incidentally, it is a waste of
time and effort to sweat blood building
pretty user interfaces and the like). Finding
useful information is an intelligent process
requiring intelligent people because, at the
end of the day only the intelligent can
recognize what is useful.’


My interpretation of what he is saying — in slightly 
more polite terms — is that the trouble with automatic 
searching systems is that before any user can find what 
he wants, he must have an understanding of the nature 
and organization (or lack of it) of what he is searching. 
Otherwise, he is incapable of forming an intelligent 
query or search strategy which is where all information 
retrieval starts. 

The founding of the Institute of Information Scientists
by Jason Farradane began what I believe to be a very 
unprofitable direction in the history of computer assisted 
information studies and management. ‘Information 
scientists’ began to think that there was a ‘science of 
information’, as opposed to a mere practice of how to 
find and deliver what scientific researchers want. I 
believe that such a conception of information work is 
as absurd as believing that there is a ‘science of law’ or 
a ‘science of politics’. The difference between the
meanings of the words ‘study’ and ‘science’ is very
significant. It is quite possible to have ‘cultural studies’, 
‘law studies’ and ‘information studies’. 

Even more pernicious for the practice of information 
management than the misappropriation of the word 
‘science’ was the influence of what has been called 
‘mentalism’ which, according to Bernd Frohmann of
the University of Western Ontario, writing in Aslib’s
Journal of Documentation, conceives of thoughts ‘as
mental processes occurring in minds’ independent of 
the language used to express them. Concepts are 
conceived as ‘elements of thoughts’ with words acting 
merely as labels of Platonic mental concepts. According 
to this theory, the process of thinking is based on rules 
which ‘operate as internal guidance systems 
determining our actions’. In his argument against
mentalism, Frohmann cites the Cambridge philosopher
Ludwig Wittgenstein, who wrote that ‘rules are properly
seen as instruments of practice’ which means that a 
rule can only be understood in the context of ‘the role 
of its practice’ in daily life, not as ‘mysterious agents’ 
determining thought. 

Frohmann concludes that ‘mentalism conceals the 
text’ because it makes the main aim of investigation 
‘the discovery of mental rules of text processing rather 
than structural properties of the text itself’, such as its 
intratextual logical and rhetorical forms, and its 
intertextual relationships where he believes that the 
best clues to indexing can be found. From this point of 
view, the rules of indexing have nothing to do with 
‘truth’ or mental constructs, but are simply the ‘tools 
of social practice’.

However, despite the healthy trend by information 
practitioners away from the quasi-scientific cloud of 
mentalism towards a professional ethos, the question
still remains whether their techniques and practices — 
which work reasonably well in researching the literature 
for scientists and technicians where the key is to know 
and be able to specify what you are looking for — are 
apposite to the purposes of business information, or 
whether they have actually hindered the development 
of business information systems. 


2. The information practices and needs of policy 
makers and investors 

2.1 Current ad hoc practice 

As yet, no coherent body of practice exists for the 
management of business information outside the albeit 
important area of financial accounting. Every company 
and institution has its own method and policy of 
dealing with information management matters, and 
the tasks associated with information management are 
distributed variously among people with the titles of 
librarian, information scientist, data processing manager 
and secretary — all of whom have completely different 
qualifications, and use quite different technical jargon 
and therefore find it virtually impossible to 
communicate with each other. 

Nevertheless, it is possible to identify a simple 
series of forms of information which are more or less 
common to most companies and institutions. For 
example, it is possible to classify the vast majority of 
corporate information into the following functional 
categories: 

• Letters
• Memos
• Contracts
• Minutes
• Reports
• Media


To date, the task of managing letters, memos and
contracts has been divided between records 
management specialists and secretaries with local file 
copies, documentation centre copies, circulation copies, 
plus the original copy being sent. The standard way of 
filing this kind of information which has evolved over 
the past couple of hundred years is in filing cabinets by 
date, and/or alphabetically by addressee and/or 
signatory. Reports and media (i.e. newspapers, journals 
and books) have been traditionally classified by 
librarians in alphabetically organized subject and author 
catalogues. 

Once documents have been filed according to one 
of these physical systems, they are generally forgotten. 
If they need to be consulted, they can be found. If a 
manager needs to consult a file, he asks a secretary, 
documentalist, or a librarian to find it for him. The 
information base is entirely passive: nothing happens 
until a question is asked. The end user only asks for an 
amount of information that he can cope with personally 
— which is necessarily a very tiny, almost random 
fragment of the information whole. Most of the 
information stored so carefully by the information 
professionals is unknown to most of the staff. Another
factor contributing to the fragmentation of corporate
information is the well documented behaviour of 
managers keeping the choice, strategically useful 
intelligence for themselves. 

As for media sources of information, the same 
conditions hold true. People only see random fragments 
of the information totality depending on what they 
happen to read, and to whom they happen to have 
spoken. They are totally blind when it comes to the 
majority of what is being reported as going on out 
there... which is why they are so often unpleasantly 
surprised, and miss opportunities which could be staring 
them in the face, if only they had known. 


2.2 Enter the ‘information scientist’ and the computer

Leaving aside accounting and payroll systems and
despite all the word processing and spreadsheet work,
the above paper based document management
procedures have remained the dominant way of doing
things for the very good reason that they are tried and
tested. However, two new technologies are now
becoming known in the business world which could, if 
used intelligently, make document storage and retrieval 
a vastly more efficient operation. 
This new knowledge is based on two facts:

• that nearly all information is in the form of
words, or needs to be indexed with words;
• that nearly all information is held on documents
which can be captured and rendered as high
quality screen facsimiles or computer images.


The first technology was invented in the late 1960s 
by people like Tony Kent for helping scientists look 
for pertinent citations and is, of course, text retrieval 
software. The other one is newer and goes under the 
name of Document Image Processing (DIP). As is so 
often the case in the world of high technology, the 
developers of each of these technologies have very 
little knowledge of the other. However, the integration
of the two technologies is technically trivial. I speak 
from direct experience. 

However, the real significance of this new level of 
technological awareness among businesses is that its 
implementation will make it plain to executives how 
much information is available at the click of a mouse 
about which they know nothing. At the same time, the 
introduction of the new technology will break down 
the traditional division of labour between data 
processing managers, secretaries, librarians, documentalists
and ‘information scientists’. Out of this
fusion will emerge a new information management 
profession, purged, I hope, of the hubris of mentalism. 

So it is now necessary and possible to ask whether 
or not the indexing and retrieval techniques developed 
for the purposes of searching scientific and technical 
data are appropriate to the management of business
information. It would be well for the information 
scientists to be aware that real end users will have no 
time for the fine points of thesauri and free text
searching. The ability to search by date, title, author,
and from lists of subjects is what they are used to, and 
are therefore likely to stick to. Everybody in the text 
retrieval business knows that users, even the so-called 
information scientists, only use a tiny proportion of the 
technological capability available to them. It is even 
possible to conceive that one day all the employees 
will have access to a corporate information directory
which will enable them both to file and retrieve
documents, according to a simple set of practices very 
similar to those used now with paper. Over time, more 
sophisticated document indexing strategies will no 
doubt develop, and standard document and indexing 
structures will become an issue of serious concern, as 
ODA (Office Document Architecture) and SGML
(Standard Generalized Markup Language) become widely
adopted. 

Once in a while, a real expert at searching 
information might be brought in to look for a document
that the standard queries do not retrieve. But by and 
large, the system will be administered by secretaries, 
documentalists and librarians working together 
according to a standard set of procedures. 


2.3 From passive to activated information 

Although these new technologies and the information 
management techniques which have just been outlined 
will make information storage and retrieval faster and 
possibly cheaper, they do not change the passive 
nature of the corporate information resource. It is only 
possible to find what is specifically sought for. The 
competitive advantage of intelligence comes from 
alerting decision and policy makers to key trends, facts 
and ideas that they had no idea existed, and therefore 
could not ask for. The need is to turn a passive corporate 
information resource into an active information source, 
which not only supplies key bits of intelligence, but 
also keeps users up to date with the right amount of information
— from general trends to files of facts — derived from 
the whole spectrum of the corporate information 
resources. It is information refineries which can turn
passive information residing in text databases and 
document images, or on paper in files, into an active 
resource keeping key personnel continually apprised
of both strategic intelligence and the wider perspectives. 


3. Building an information refinery 

3.1 Definition 

Refining information is a socio-technological process 
which enables intelligent human beings to extract and 
organize systematically the key items of knowledge 
kept in any given choice of information sources. The 
purpose of the process is to enable people from 
executives downwards to be better and more widely 
informed, while at the same time reducing the amount of 
time they have to spend keeping up with the latest books 
and periodicals... which in turn frees more time for them
to read what they are really interested in. The result of 
the refining process should be to bring about better, 
more informed decisions by intelligent decision makers.


3.2 Origins

The first serious attempts at using analysts to find 
useful intelligence hidden in a large number of 
published and private sources were undertaken by 
military intelligence services, such as the CIA and 
MI5. Content analysis has also been used in the social
sciences and in literary criticism since the 1930s. 
Indeed the best book that I have read on the subject was 
written by a classicist, T F Carney, who used the
process to make inferences about cultural trends in 
Greek and Roman times.* 

The first attempts to apply content analysis based 
information refining to current business, social, political 
and environmental issues were the American and 
Canadian Trend Reports which were set up by Kristin 
Shannon and John Naisbitt in the early 1970s. John
Naisbitt popularized the content analysis methodology
with his book, Megatrends. I learned the basics of the
methodology from Kristin Shannon when I became senior
analyst in charge of Government and Politics,
Communications and Environment in the late 1970s.
In the early 1980s, I came to the UK to develop and
practise ‘third generation’ content analysis which
exploits the latest computer networking, database and 
display technology expressly to produce more highly 
concentrated, value-added information products. 

Trend Monitor International Ltd. now publishes a
series of biannual Reports on Computing, Communications,
Media and Socio-technologies, along with
other high value products incorporating more expert
interpretation, and significant cross-links, opportunities
and dangers to be found outside readers’ necessarily
more limited reading range. The aim is to provide a
knowledge edge in order to give them strategic
advantage when trying to work out the key to
successfully combining new technology and new
markets.


3.3 Principles 

Information refining draws significant inferences from 
large bodies of information. The basic data is how 
events and thoughts are recorded, not the events 
themselves. The content analyst is not writing about 
reality, he is writing about how that reality has been 
portrayed by both reporters and actors as it has been 
recorded. Analysts are trained strictly in order to avoid 
the dangerous phenomenon of the ‘voice problem’ 
where the analyst speaks in his own voice in the guise 
of speaking in the voices of others. 

Although our publications appear to be more fact- 
filled than any other source, we make it plain to our 
readers that we are actually recording perceptions of 
events, not facts, but that because we cover so many 
facts, it becomes possible in commentary mode to 
infer indications of the truth, and see developments in
terms of trends, rather than a random sequence of
events.

Although Trend Monitor relies entirely on the 
judgement of highly informed analysts and commentators,
it is fully cognizant that any interpretation of
information must add a degree of bias to the picture
produced. That is why Trend Monitor does not hire 
dogmatic ideologues, but independent thinkers whose 
bias is towards finding and publishing the best, the 
most important information first. 


3.4 Practice 

In order to set up a functioning content analysis unit, it is 
first necessary to develop a content classification schema, 
a time consuming iterative/evolutionary process. The 
diagram shows Trend Monitor's tree-structured content 
classification matrix for Computer Software. 

The first problem a content analyst must deal with 
is to develop a classification schema which models the 
subject area in a useful and manageable way. According
to T F Carney’s Content Analysis: A technique for
systematic inference from communication, a good
multi-variate classification schema forces the classifier,
and hence the end user, to ask meaningful questions
of the text being studied from a selection of different,
but useful viewpoints.*

Trend Monitor's classification schema makes a 
top level division between Subjects and Issues. Subjects 
are deemed to be classes of concrete ‘actors’, i.e.
groups of people, their tools, technologies, materials
and the environment. Issues are abstractions describing
classes of action which apply across many subjects. 
[See diagram following references.] 

My close colleague, Bob Sprigge, writes:

‘A classification schema is developed top-down,
then bottom up. It is highly subjective.
From the top down, it is necessary
to consider what the user would expect and
comprehend. (Ideally, there would be different
schemas for each class of searcher.)
From the bottom up, the material at hand
dictates the categories by its quantities’.


The process is iterative — dividing and sub-dividing 
documents on the basis of the similarities in their 
logical and story content. 

The hierarchical tree structure of schemas is useful
for two reasons. First, such schemas are easy for untrained
searchers to follow. People are taught at a very young
age to be able to comprehend the difference between
the general and the specific. Second, they naturally grow.
Trunks, branches and leaves can be added to reflect the 
appearance of new types of subjects and events. 
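
As a rough illustration of how such a tree-structured schema might be held on a computer, here is a minimal sketch in Python. Only the top-level split between Subjects and Issues comes from the text; the class name, the method names and every category beneath the top level are invented for the example and are not taken from Trend Monitor's own matrix.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One category in the schema; children are narrower categories."""
    name: str
    children: dict = field(default_factory=dict)
    documents: list = field(default_factory=list)

    def add_path(self, path):
        """Create (or reuse) the chain of categories named in path."""
        node = self
        for name in path:
            node = node.children.setdefault(name, Node(name))
        return node

    def classify(self, path, doc_id):
        """File a document under the most specific category in path."""
        self.add_path(path).documents.append(doc_id)

    def find(self, path):
        """Return every document filed at or below the given category."""
        node = self
        for name in path:
            node = node.children[name]
        found = list(node.documents)
        for child in node.children.values():
            found.extend(child.find([]))
        return found


# Top-level split taken from the text; everything below it is hypothetical.
schema = Node("Computer Software")
schema.add_path(["Subjects"])
schema.add_path(["Issues"])

# New branches and leaves appear as the material demands them.
schema.classify(["Subjects", "Suppliers"], "article-001")
schema.classify(["Subjects", "Suppliers", "Text retrieval"], "article-002")
schema.classify(["Issues", "Standards"], "article-003")

# A general query returns everything beneath it; a specific one narrows.
print(schema.find(["Subjects"]))
print(schema.find(["Subjects", "Suppliers", "Text retrieval"]))
```

The point of the sketch is only that an untrained searcher can move from the general to the specific by following the path, and that new categories can be added without disturbing what is already filed.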

On choosing source material, Bob Sprigge says:

‘When using external material it is
necessary to start with the key publications
on the subjects, filling in the gaps as they
are observed. As more publications are
added, the popular stories are seen from
more angles, and more generally unreported
stories appear.’


After scanning sources and classification, the next step
is to write what we call ‘narratives’ or syntheses
consisting of descriptions and key extracts from groups
of articles, as opposed to abstracts of single articles. As
Bob Sprigge says, ‘Writing the nuggets requires the
consistent recreation of the context, e.g. country, year,
issue. This and the continuity imposed by the schema
are what distinguish the output from simple abstracts.
No longer are isolated nuggets being reported, but a
systematically told story emerges.’

The computer world is spending a great deal of 
money trying to automate the process of inferring what 
is significant from a body of text using word-searching, 
word-weighting and rule based AI strategies. I believe
these strategies will ultimately fail, after much precious
research money has been wasted on trying to get 
machines to do what humans do quite naturally which 
is to judge and evaluate meaning and significance. 


4. Conclusion 

4.1 Significant employment opportunity

If business and government get serious about the value
of Information Refineries, then demand for people
who can classify and evaluate different bodies of text
will increase significantly. The basic qualification of a
content analyst is literate intelligence, that is, good
reading comprehension. The process of doing content
analysis makes its practitioners very knowledgeable,
very quickly, about the subjects which they are covering. It
is useful both to combat information overload in
advanced industrial economies, and to combat
information poverty in post-communist and post-colonial
countries.

The fact that content analysis is little known, and
the concept of information refineries is so new makes
the opportunities all the greater.

© Trend Monitor International Ltd.
Virtual documents have become a reality with the widespread use of the technology available. This paper considers
the impact of the concepts involved in virtual documents in the commercial arena. By reference to practical examples 
where companies have implemented virtual document systems in order to achieve profit and/or corporate advantage 
the paper considers the significance of standards and the direction they will need to take. 


Introduction 

Document — in the sense implied by this conference — 
is defined in the Oxford Dictionary (older version) as 
‘Something written, inscribed, etc., which furnishes
evidence or information upon any subject, as a 
manuscript, title-deed, tomb-stone, coin, picture, etc.’ 
Historically, documents have not been limited to
manuscripts; indeed, frescoes were considered
‘documents’ in 1850¹. The Bayeux Tapestry is another
example of a document from history. 

The point I wish to make is that the term documents,
virtual or otherwise, may be defined for the purposes 
of this paper as information sources that are made 
available as reconfigurable — or virtual — documents. I 
will differentiate between Compound Documents and 
Virtual Documents on the basis that standards for 
Compound Documents have been proposed and these 
address many of the components of a document: text; 
pixels (for scanned images or raster graphics); 
geometric graphic elements (lines, arcs, polygons etc.); 
and spreadsheets. Virtual documents can also include 
elements such as voice, sound, sequenced images and 
hypertext. There are standards proposed that will cover 
these elements, including Digital's Compound 
Document Architecture, but these have yet to be adopted
as an international standard. The Oxford Dictionary
has a number of definitions² of ‘Virtual’ including:

• Possessed of certain physical virtues or
capabilities; effective in respect of inherent
natural qualities or powers; capable of exerting
influence by means of such powers. Now rare...

• Capable of producing certain effect or result;
effective, potent, powerful. Obs.

• That is so in essence or effect, although not
formally or actually; admitting of being called
by the name so far as the effect or result is
concerned.

While it is the last definition that is commonly 
understood to define ‘Virtual’, I rather like the first two 
definitions as they retain a sense of significance. Virtual 
documents are, by implication, a dynamic concept. 
They represent the idea that data can be restructured, 
combined with other data or enhanced and annotated
by my own intervention to present data in a specific
way. This may suit only my own needs or may be 
something I wish to disseminate to others. 

The point of view which I will adopt for this talk is 
to focus on the commercial exploitation of the 
technologies that have been developed for information 
presentation: these use some, and indeed many, of the 
concepts involved in virtual documents. Companies 
involved in this require standards in order to control, 
measure, duplicate and, sometimes, export their product 
processes. They need standards to enable products to 
be supplied to users with acceptable means of accessing
or using them and also to protect their future by 
protecting their investment — not always successfully 
as IBM have demonstrated. Consumers also seek the 
quality assurance and user compatibility that standards 
offer. From this you may gather that I am not an expert 
in virtual documents, and I am not going to expound 
on the leading edge technology in this area. 

My particular background is that of a business
manager, first in publishing and, later, in a software
development company specializing in text and image
information systems. This experience will be the basis 
of the examples and ideas developed for this talk. In 
considering the commercial issues it has to be 
recognized that there are many ways in which 
technology is exploited. These can be roughly broken 
down into three areas: 

• developers and suppliers of the hardware and
software technology;

• developers and suppliers of applications based
on the technology, including entertainment and
game suppliers as well as commercially targeted
packages;

• companies who identify organizational or
commercial benefit from the adoption of the
technology.

Here I have excluded consumer games products, as
consumers of these products have no necessary
requirement to integrate them with anything other than
themselves. For most businesses it is this last
category that, inevitably, determines the success, or
failure, of technology. Companies and organizations
as a whole are only now beginning to consider and
evaluate virtual document technology. Some
organizations have been involved in structuring their
information for a number of years; the majority have
yet to perceive benefits. Our company, a software
house, has been involved with a number of
organizations in publishing who have used,
or are in the process of using, visual display technology.
These range from systems for production of, amongst
other things, The Guinness Book of Records in its
various formats to a UN organization which prints
some one million unique pages per week.


The past 

BC stands for Before Computers, at least insofar as 
they had no impact on document production — perhaps 
not back to the days of medieval scholars, but to the 
days of typewriters and hot metal typesetting. It was 
with the arrival of computer technology that this world 
changed. So does the past have any relevance to 
standards for management of virtual documents? 

In published documents there is nothing that we 
can do today that could not be done ‘in the past’ — it just 
took more time, greater effort, and tended to be prone 
to error. How many medieval texts are imaginative 
copies of the original? Standards also existed:
these ‘rules’ determined the layout, style and process
of publishing. The past is still the rock on which we 
base many of today’s standards. 

The past is not a moment of time; it covers many
changes. Indeed the technology that was used before 
computers is still being developed. The past exists in 
the present and the issue of compatibility between the 
various technologies is still at an early stage. 
International standards of interoperability remain an 
essential goal. A start has been made. 

Let us dwell for a moment on what that ‘past’ 
meant and The Guinness Book of Records (GBR) will 
serve as a good example. In 1986 our company carried 
out a consultancy for Guinness on computerizing their 
publications; their primary emphasis was and is the 
GBR. Considering how much revenue it earns this is 
hardly surprising. 

The Guinness Book of Records is a book of 
reference. The editorial team is surprisingly small: in
1986 it consisted of six people. Supporting their work
is a large number of specialists who supply information
on particular topics: astronomy; zoology; physics; 
sports etc. Much of the work carried out by the staff 
prior to going to production was requesting and collating 
copy from these specialists. In addition they supported 
over 40 foreign editorial teams producing the GBR in 
their own language and adding their local records. 

Production was cut-and-paste, with galleys sent off
to be typeset and proofed, and late amendments
made by hand on the proofs, a procedure that adds
considerably to costs. This manual process was
common at the time and is still a widespread practice.
The problems for their publication were based around
the fact that the information contained was continually
being updated. The physical process of despatching copy,
having it typeset, returned, proofed and updated, goes
on and on until it’s so late you have to go to print. The
danger was in managing the incoming data and ensuring
its inclusion once authorized. I do not think I am
breaching any confidentiality in saying that a number
of ‘achievements’ (it is difficult to avoid saying records)
were mislaid and never made it into the book in the
confusion of production. The other significant problem,
probably more prevalent in their situation than most,
was handling enquiries. These would range from young
‘Johnny’ ringing up to ask whether his cat was the
heaviest in the world, to a serious record attempt being
considered, where the enquirer wanted to know the current
record and the rules. While ‘little Johnny’ would be
satisfied with a direct quote from the last edition, those
involved in a serious record attempt would not be if a
more recent record had been established but not yet
published (apart from the Warhol ‘15 minutes of fame
syndrome’, some people gain financial benefit from
being recognized in the book). The sheer volume of
paper, pending records, claims and general correspondence
made answering these enquiries very time
consuming. Our study suggested 60% of time was
spent on such enquiries, 10% on administrative activity,
10% on claim verification and 20% on editorial activity.

Having identified their problems we then developed
a strategy for them, the first stage of which was to stay
with the past but introduce typesetting technology in- 
house. This could be justified commercially simply on 
cost and time savings — it also kept the company within 
the culture its staff knew and introduced keyboard and 
screens to them. It was the change of culture associated 
with a new technology that was identified as the 
greatest problem to be overcome. The real challenge
was to make the information available for use in a
number of different ways. To have the ability to produce
a document or information relevant to each enquiry
there was a need for a virtual document system. They
had to move to a computer based information system
providing their data in electronic form.


The present

For me the present started in 1980 when I became
involved in publishing a directory from a database.
The hierarchical database introduced me to the
structured nature of documents. The legal directory
consisted of some 15,000 practices employing over
50,000 qualified solicitors. The publication had been
hot metal set for over 140 years. The first database was
a disaster. In effect it simply typeset the publication
from a database that was maintained externally. Unlike
the Guinness situation, this solution offered no real
commercial benefit and the company abandoned it
after eighteen months. We replaced it with a new
database facility with in-house data entry and output.
The advantages were enormous and effective within
the first few months. Data forms for both firms and
individuals could be generated directly, and information
received could be reflected in the database, allowing
for immediate and accurate response to enquiries.
However, the true motivation for the development was
a change to the directory structure. The main directory
was to be enhanced with additional information on
firms, individuals were verified by The Law Society,
and a separate publication of 28 regional listings was
produced. The two directories, the main legal directory
and the Regional Directories, were published from the
same database but in presentation, organization
and detailed information they were very different.
Also a new commercial opportunity had been opened
up with the sales of high value by-products at only
marginal cost. We had moved into the world of virtual
documents and one generating very real profit.
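
The idea of publishing two quite different directories from one set of records can be sketched in a few lines of Python. This is an illustration only, not a description of the system we built: the record fields, the sample practices and both function names are hypothetical, and a real directory would of course carry far more detail.

```python
from collections import defaultdict

# Hypothetical records: each practice is held once in the database.
PRACTICES = [
    {"name": "Brown & Co", "region": "East Midlands", "town": "Leicester",
     "solicitors": ["A Brown", "C Davies"]},
    {"name": "Archer & Sons", "region": "South East", "town": "Milton Keynes",
     "solicitors": ["E Archer"]},
    {"name": "Clark & Partners", "region": "East Midlands", "town": "Derby",
     "solicitors": ["F Clark", "G Hall"]},
]


def main_directory(practices):
    """Full national listing: alphabetical by practice, with solicitors."""
    lines = []
    for p in sorted(practices, key=lambda p: p["name"]):
        lines.append(f"{p['name']}, {p['town']}: {', '.join(p['solicitors'])}")
    return lines


def regional_directories(practices):
    """Shorter listings, one per region, ordered by town then practice."""
    regions = defaultdict(list)
    for p in sorted(practices, key=lambda p: (p["town"], p["name"])):
        regions[p["region"]].append(f"{p['town']}: {p['name']}")
    return dict(regions)


# Two publications, differing in organization and detail, from one source.
for line in main_directory(PRACTICES):
    print(line)
for region, entries in regional_directories(PRACTICES).items():
    print(region, entries)
```

Neither publication is stored anywhere; each exists only when it is generated, which is what makes the marginal cost of a further by-product so low.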

Having implemented the technology for production,
other opportunities become apparent: the technology
developments associated with document management 
generate new market opportunities; electronic 
publishing mediums such as CD-ROM become product 
options, although yet to be significant revenue earners 
other than in niche areas. The technology has become 
an important factor in assessing business strategy. The 
technology has to be viewed as potentially both the 
means to achieve production and the product concept 
itself. It was consideration of these factors that led 
Guinness to move beyond the Phase 1 typesetting 
system. 

Guinness Publishing installed a database to manage 
the status of the entries. Updating and editing became 
a continuous procedure rather than an annual event. 
Associated and background information was also held 
on the database. At any point of time the database held 
an up-to-date version of The Guinness Book of Records. 
The first commercial use of this database was to
generate the data for production of the Guinness Disk
of Records, a CD-ROM version of the book. Last year
Guinness sold development rights for a CDTV, CDI
and two CD-ROM versions of the book. It should be
noted that Guinness does not itself develop the
multimedia products; they do not yet perceive the
financial rewards to be worth the investment in a
technology area that is rapidly changing and which has
yet to define the ultimate standard: which medium will
become internationally successful in the consumer
market.

There is one other aspect of the Guinness database 
that is relevant here. In 1989 Guinness purchased back 
the US publishing rights to the book. Previously they
had published only a UK edition and a UK edition
with an Australian supplement, and all overseas
publications were licensed. Sales in the US were not
satisfactory. The database was adapted to enable the 
typesetting of both UK and US publications from a 
single database. This involved: 

• ‘translating’ Anglo-English to American-English;

• allowing UK local entries and US local entries
(as opposed to World Achievements);

• allowing for Americanization of entries having particular
interest in the US;

• holding measurements in both metric form and
UK-Imperial or US-Imperial where they differ
and outputting them in appropriate formats.


These ‘translations’ were made on the fly: the US
editors called up their relevant entries and saw them in
the Americanized form. On saving, entries were
stored in Anglo-English in the database. The US entries
were truly virtual; they only existed in their particular
form when output to the screen or paper. The cost
savings were significant: for a start, instead of a
duplicate US editorial team, they required one full
time editor plus a part-time assistant during production.
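
A minimal sketch, in Python, of how such on-the-fly rendering might work is given below. It is not the Guinness system: the spelling table, the field names and the functions are all assumptions made for the example, and for simplicity the measurement is held in metric and converted on output, whereas the real database held the differing forms side by side. The gallon factors are the standard definitions of the imperial and US gallon.

```python
import re

# Illustrative spelling pairs only; a production table would be much larger.
UK_TO_US = {"colour": "color", "litres": "liters", "harbour": "harbor"}
US_TO_UK = {us: uk for uk, us in UK_TO_US.items()}

# Litres per gallon: the UK (imperial) and US gallons differ.
LITRES_PER_GALLON = {"UK": 4.54609, "US": 3.785411784}


def _swap_words(text, table):
    """Replace whole words found in table, leaving everything else alone."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


def render(entry, edition):
    """Build the view of a stored entry for the 'UK' or 'US' edition."""
    text = entry["text"]
    if edition == "US":
        text = _swap_words(text, UK_TO_US)
    gallons = entry["litres"] / LITRES_PER_GALLON[edition]
    unit = "litres" if edition == "UK" else "liters"
    return f"{text} ({entry['litres']} {unit}; {gallons:.1f} {edition} gallons)"


def save(entry, edited_text):
    """Store an editor's text in Anglo-English, whichever edition it came from."""
    return {**entry, "text": _swap_words(edited_text, US_TO_UK)}


stored = {"text": "Largest harbour tank, painted in one colour", "litres": 1000.0}
print(render(stored, "UK"))
print(render(stored, "US"))

# A US editor's change is normalized back before it reaches the database.
stored = save(stored, "Largest harbor tank, painted in a single color")
print(stored["text"])
```

The design choice worth noting is that only one canonical form is ever stored; the American entry exists solely as the result of calling `render`, which is what is meant above by calling it truly virtual.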


A general mark-up language approximating to 
international standards underlay the database system 
so that the typesetting codes etc. can be output in a 
specific format when addressing the various devices or 
production requirements: various printers for proofing; 
the typesetting system; electronic publishing etc. 

It should be noted that Guinness and the publishing
industry are only one example of the interest in virtual
documents. One client is a personnel recruitment
consultancy — they manually enter the curricula vitae 
received into a database that enables a standard CV 
layout to be generated. They also use the database to 
produce abstracts from the CVs for clients with 
particular requirements. 


Another area with a different requirement is the
law. In commercial litigation literally thousands of
documents may be collected. A high percentage will
have no relevance to
the final trial, if matters go that far, but the lawyers
cannot be sure which ones will have a bearing. We
have developed a Litigation Support System that
captures the documents as images and also uses OCR
to capture the text content. The case will ultimately
rely upon the selection of documents and affidavits
supplied to the court; each selection of documents is a
separate document in its own right. Prior to trial the
lawyers acting for the opposing sides will have
submitted various sets to each other. The order of
presentation of these documents is often significant:
the lawyer supplying them wants to conceal the line of
argument as much as possible yet is obliged to reveal
the evidence. It is the sequencing of documents that is
often used to obfuscate the logic, and each set is itself a
virtual document within which the relevant information
exists. We estimated that the productivity gain for a senior
lawyer is over 60% on document management, that is, time
spent retrieving, searching and copying documents.
At a chargeable fee rate of over £200 per hour to a
client, the benefits that a legal firm could achieve
are clear.
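The idea that each set is itself a virtual document can be sketched as follows. This is a minimal illustration and not the Litigation Support System: the store, the identifiers, the file names and the functions `assemble` and `search` are all hypothetical. Each captured document is held once, and a set is no more than an ordered list of references to it.

```python
# Hypothetical store: each captured document keeps its image and OCR text once.
documents = {
    "D1": {"image": "d1.tif", "text": "Sample letter"},
    "D2": {"image": "d2.tif", "text": "Sample invoice"},
    "D3": {"image": "d3.tif", "text": "Sample memo"},
}

# A set is only an ordered list of identifiers; the order is part of the set.
sets = {
    "submitted_set": ["D3", "D1"],
    "trial_set": ["D1", "D2", "D3"],
}

def assemble(set_name):
    """Return the virtual document: stored texts in the order the set dictates."""
    return [documents[doc_id]["text"] for doc_id in sets[set_name]]

def search(term):
    """Return identifiers of documents whose OCR text contains the term."""
    term = term.lower()
    return sorted(doc_id for doc_id, doc in documents.items()
                  if term in doc["text"].lower())
```

Because a set holds references rather than copies, the same document can appear in several sets in different positions, and resequencing a set does not touch the captured material.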


Other areas that have perceived the benefits of this
technology include a UN organization, for its complex
personnel records as they apply to an individual at any
time in the last ten years; technical manuals for
safety-critical operations such as oil platforms; commodity
shipping; and private banks, for client portfolio
management.
The future, and some standards for making it work
To return to Guinness and other publishers: where do
they seem to be going? One direction is towards what
Seybold have called Fourth Wave publishing
systems. Simply put, this means integrating typesetting
systems fully into a database, or similar standard
retrieval system, so that the front end of the database is
the typesetting system. In other words the storage
mechanism is independent of the presentation system.
This allows final amendments to be recorded
automatically in the database. Currently data flows from
the database into a typesetting system, and any changes
entered into the typeset document have to be re-entered
into the database. The integrated approach could well
become a de facto standard for virtual document
systems.
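A minimal sketch of this arrangement, using Python's built-in `sqlite3` module, is given here. The table layout, the sample text and the function names `typeset` and `amend` are assumptions for illustration, not a description of any Seybold or Guinness product. The point shown is that the typeset form is always generated from the database, so an amendment made at the front end never has to be keyed in twice.

```python
import sqlite3

# In-memory database standing in for the publisher's store (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, heading TEXT, body TEXT)")
conn.execute("INSERT INTO entries (heading, body) VALUES (?, ?)",
             ("Sample heading", "First draft."))
conn.commit()

def typeset(entry_id):
    """Presentation layer: build the page text from the stored fields."""
    heading, body = conn.execute(
        "SELECT heading, body FROM entries WHERE id = ?", (entry_id,)
    ).fetchone()
    return f"{heading.upper()}\n{body}"

def amend(entry_id, new_body):
    """A correction made at the typesetting front end goes straight to the database."""
    conn.execute("UPDATE entries SET body = ? WHERE id = ?", (new_body, entry_id))
    conn.commit()
```

After `amend(1, "Final text.")`, the next call to `typeset(1)` reflects the correction, because there is no separate typeset copy that could drift out of step with the stored data.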

Electronic dissemination of information, in
whatever form that information or data exists, is not only
about delivering documents but also about taking
information from one environment and placing it in
another. The recipient may disassemble that data and
create a new document. The future is, in part, with us
now. To realize fully an environment within which any
and all data repositories, containing image, sound,
text, spreadsheets and video, are seamlessly combined
will require a raft of internationally adopted standards
for software packages to conform to.

Standards will need to cover the following: 

• storage;

• retrieval;

• output form;

• filters;

• presentation form.


Whilst Relational Database Management Systems
(RDBMS) will continue to be the most common storage
standard appropriate for virtual documents, any widely
installed standard for virtual documents will also need to
address other DBMSs, object-oriented databases,
flat file storage systems and SGML-tagged data. The
greatest challenge will be the development of protocols
allowing data to be stored in a truly heterogeneous
network, yet seamlessly accessible from any machine,
without regard for the operating system. The data itself
will need to be held in a standard that describes the
content, structure and appearance of the data elements,
allowing presentation to be adapted for the appropriate
media.
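The separation of content, structure and appearance can be illustrated with a short Python sketch. The element roles, the two media and the tag names are invented for the example and do not follow SGML, ODA or any other standard. Content and structure are stored once; appearance is a separate rule set chosen at output time.

```python
# Content plus structure: each element carries its text and its logical role.
document = [
    {"role": "heading", "content": "Sample heading"},
    {"role": "paragraph", "content": "Sample paragraph text."},
]

# Appearance: one rule set per output medium (both sets are illustrative).
APPEARANCE = {
    "screen": {
        "heading": lambda text: text.upper(),
        "paragraph": lambda text: text,
    },
    "markup": {
        "heading": lambda text: f"<h>{text}</h>",
        "paragraph": lambda text: f"<p>{text}</p>",
    },
}

def present(doc, medium):
    """Apply the rules for one medium to every element, in document order."""
    rules = APPEARANCE[medium]
    return "\n".join(rules[element["role"]](element["content"]) for element in doc)
```

Supporting a further medium under this scheme means adding one more rule set to `APPEARANCE`; the stored document is not altered.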

SQL does seem to be the de facto standard for
retrieval. The extensions to the ANSI standard that
will allow text retrieval need to be extended further to create
a standard syntax for retrieval of all data types.

Will SGML or ODA become an internationally
adopted standard? These standards were developed
for the representation and interchange of structured
documents, and are specifically designed to allow for
subsequent editing. Standard Generalized Markup
Language (SGML) differs from the Office Document
Architecture (ODA) primarily in that SGML is
essentially a syntax for describing the logical structure
of documents, without any syntax concerning the layout
or appearance of the document. ODA also provides a
model for describing documents with a clear separation
of structure and content. ODA would seem to be the
more appropriate standard for multimedia documents.
Hypertext facilities, discussed later today, do not
currently have an equivalent standard. Yet for a standard
to be useful ultimately it must be widely adopted.
SGML is, in principle, a simpler standard, and also has
the advantage that SGML-tagged documents can be
read in raw form. ODA is somewhat more complex,
and as the standard is enhanced to address issues such
as cross-referencing it will become more difficult for
casual users to implement.

Filters, or protocol converters, are software
which sits between proprietary packages
and makes their output compatible. A typical
example would be Carousel, which ‘distils’ PostScript
code and creates a generic PostScript conversion for all
PostScript output devices. Such software products could
provide the basis of an intermediary standard for data
transfer, allowing proprietary packages to continue in
use. Ultimately, the successful standard will be defined
by the biggest players in the software market, the word
processing package suppliers, and be elected by a
majority decision of the users.
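The role of a filter as an intermediary can be sketched in Python. The two formats here are invented stand-ins for proprietary packages (package A writes key=value lines, package B writes JSON); nothing in the sketch models Carousel or PostScript. Each package gets one reader and one writer against a neutral in-memory form, and conversion is always reader followed by writer.

```python
import json

# Package A (hypothetical): one "key=value" pair per line.
def read_a(text):
    return dict(line.split("=", 1) for line in text.splitlines() if line)

def write_a(record):
    return "\n".join(f"{key}={value}" for key, value in record.items())

# Package B (hypothetical): a JSON object.
def read_b(text):
    return json.loads(text)

def write_b(record):
    return json.dumps(record, sort_keys=True)

READERS = {"A": read_a, "B": read_b}
WRITERS = {"A": write_a, "B": write_b}

def convert(text, source, target):
    """Filter: source format -> neutral dictionary -> target format."""
    return WRITERS[target](READERS[source](text))
```

The design choice is the neutral form in the middle: with N packages, N readers and N writers suffice, whereas direct converters between every pair would number N × (N − 1). That is the sense in which a filter could serve as an intermediary standard while the proprietary packages stay in use.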
Abstract


As Geographical Information System (GIS) databases have become progressively larger and more complex, and 
increasingly unconstrained by traditional geographical boundaries, data management and quality have emerged as 
critical issues in GIS projects. GIS technology can be considered a table-top on which data are compiled, considered, 
manipulated and located. With reference to the development of a geographic database for a loosely defined sub- 
region of the South East of England — the East Thames Corridor — the problems associated with obtaining data of an 
appropriate standard from within the institutional framework of local government, and the data quality problems 
inherent in the development of an integrated database are considered. 


Introduction 
Data quality in an integrated database 
It is commonly accepted that integrating data from
different sources leads to ‘added value’ and offers
considerable scope for more general use of data which
would otherwise be restricted to a limited number of
organizations. True data integration, however, remains
an elusive target because different users create
compatible data sets differently. Data are maintained
in diverse locations at different scales and accuracies,
using different software on a variety of hardware
platforms. Database uncertainty increases as data sets
of different origin, and therefore different data history
and quality, are combined. Analysis carried out on an
integrated database can ultimately only be as accurate
as the ‘worst’ data set included. Data
combination is, however, necessary, as the
management objectives and decision-making problems
dealt with by planning agencies in particular often
transcend functional and organizational boundaries
and require an array of diverse data sets. In this context,
local government is a very important source of data for
social and economic planning and land management
in the United Kingdom, yet it remains an environment
typified by an absence of data standards, inconsistency
in data collection techniques, problems concerning
data ownership and control of access to data, and
reservations about releasing data to other users
(Leighton and Kutsal¹). Exacerbating the
problem is the fact that the data required to compile a
planning and land management database are often not
collected at all, and if they are collected, they are often
in a form which is not readily usable.

This paper does not intend to examine the data
quality issues inherent in conceptual, spatial and
attribute uncertainty (Becard²) as these are discussed
extensively elsewhere (e.g. Chrisham³, Coward and
Heywood⁴, Dangermond⁵). It is instead our intention to
discuss the data quality issues which may arise as a
result of compiling a database from a variety of different
data sources, a task which has been likened to compiling a
GIS database ‘jigsaw’.


The East Thames Corridor project 

Many of the technical and institutional issues which
influence the data quality of an integrated database
have been addressed during the development of a land
and economic information system for the East Thames
Corridor (figure 1), a project under the Economic and
Social Research Council/Natural Environment
Research Council Collaborative Programme on
Geographical Information Handling. The development
of the East Thames Corridor (ETC) GIS was undertaken
by the South East Regional Research Laboratory
(SERRL), in association with the London and South
East Regional Planning Conference (SERPLAN), in
response to the need for information coordination and
dissemination to improve planning and facilitate
development in the ETC. A principal theme of regional
planning guidance for the South East of England is the
need to redress the economic imbalance between the
eastern and western sides of the region. SERPLAN
identified the ETC as the principal area in the east of
England in need of and with scope for urban renewal
and development (SERPLAN⁶,⁷), and a Deloitte
Haskins and Sells report identified the ETC's lack of
identity and the absence of up-to-date and well-presented
information as two of the main factors
militating against development and environmental
improvement in the corridor (SERPLAN⁸).

The data quality issues relevant to an integrated
database can be readily addressed with reference to the
ETC project because of the unique geographical
characteristics of the area (figure 2). The ETC does not
conform to existing administrative boundaries. It incorporates
within its borders a markedly varied physical
environment, ranging from intense urban development
in East London, a large area of Metropolitan Green
Belt, mineral workings and agricultural land, through
to 29 Sites of Special Scientific Interest (SSSIs) and the mud
flats and salt marshes of the Thames Estuary. It also
contains parts of 10 London boroughs, 11 district
councils, 2 county councils and the London Docklands
Development Corporation (LDDC), and is of interest to
a number of environmental and local political pressure
groups. This lack of profile, the complexity of the
environment and the fragmented nature of local
authority data have had a two-fold effect on the corridor.
Not only have they hindered development and environmental
improvement through a lack of coordinated
decision making, but in the past they have also hindered the
establishment of a comprehensive information source
necessary for the promotion of investment and
regeneration. The challenge is to see if an integrated
information system can create a ‘profile’ for the corridor.
The ETC is also a significant GIS research area
because of its potential importance in Europe. It is a
rapidly changing area with enormous development
potential, including within its boundaries the route for
the Channel Tunnel high speed rail link to London.
The Secretary of State for Trade and Industry, Michael
Heseltine, envisages the development of new industry,
housing and education facilities within the corridor,
and recognizes that its position makes it an area of key
significance for communications, linking not only
London and the South East but the Midlands and the
North with Europe.

‘There is no urban project in the world
that should command more notice and
imagination’.
(Michael Heseltine speaking at the annual
LWT London Lecture, 12 December 1991⁹).

Significantly, however, no sub-regional authority
exists to coordinate the planned changes, or the
development and environmental improvement initiatives
already in place.


The aims of the ETC project were basically
threefold, although there were a number of specific
objectives associated with each:

• develop a database which is accessible through
GIS technology,

• enhance understanding of the institutional
requirements for handling a multi-authority,
multi-subject database,

• apply the database in order to carry out research
on the determinants of land-use change and
employment generation in the ETC, and generate
profiles of the ETC at a variety of local scales
that are useful to planners and developers.


The remit of the ETC project to address such a
diverse range of issues determined the need for a
wide-ranging database and the integration of a number of
different data sets relevant to planning and land
management. Thus the project involved the integration
of data derived from a variety of sources (figure 3),
obtained in different formats and relating to different
spatial scales. As a consequence, one of
the main themes of the ensuing research was an
investigation of the problems associated with
integrating spatially related digital data derived from a
variety of government and non-government sources.
Data quality issues arose because much of the data for
such an information system had to be derived from
public data sources, and given the ‘state of the art’ of
digital data in the public domain, such data are inevitably
problematic. ETC project data sets focus on
administrative areas, population distributions,
employment and unemployment, land use change,
travel-to-work patterns, the natural and built
environment, infrastructure and proposed infrastructural
developments, and sites of development
constraint and potential. These data are held at one or
more spatial extents, ranging from the South East of
England to the East Thames Corridor itself.

Figure 1. The Counties of South East England and the East Thames Corridor.

Figure 2. Geographical Character of the East Thames Corridor.


Institutional constraints to data integration 
Introduction 

Technological progress and the reduction of hardware
and software acquisition costs have removed many of
the barriers which have in the past restricted the
development of GIS in local government in the United
Kingdom. The actual introduction of GIS technology,
however, involves managing change within
environments which are typified by uncertainty,
entrenched institutional procedures and individual
staff members with conflicting personal motivations
(Campbell and Masser¹⁰). The capture, maintenance,
quality, testing and sharing of digital data by local
authorities are afflicted by the same constraints. Data
collection procedures, for example, are often delayed
in organizational committees, may be piecemeal due
to a lack of understanding by some of those involved
(with involvement determined by established council
hierarchies), and commonly are not comprehensive
due to indecisiveness and uncertainty about resource
needs. There also exists a caution about providing
data to other users both within councils and between
councils, and in particular to outside agencies. This
data ‘protectiveness’ stems from a false notion of data
ownership present in most tiers of government.
Openshaw et al. noted that this caution about providing
‘public’ data to outsiders is endemic in the UK and
concluded that the results for research and the adoption
of relevant GIS techniques are potentially catastrophic.
They saw the reasons for users failing to exploit many
existing data sets as including:

• an over-emphasis on confidentiality, enshrined
in outdated laws,

• unrealistic belief in data quality,

• reluctance of separate departments to lose control
of ‘their’ data,

• technical difficulties of integrating data sets
organized incompatibly,

• lack of resources and expertise, and

• a lack of traditions in identifying new needs.

Yet local government is, and will continue to be, an
invaluable (and in many instances the only) source of
social and economic data for planners, land managers,
the social services and central government.

Figure 3. Information Flows in the ETC-GIS Project.


Local government interviews 

In order to assess the problems associated with
extracting data ‘pieces’ from local government and
obtaining these data in appropriate formats for
integration, two surveys were undertaken. The first
survey was directed at the planning officer(s) concerned
with spatial information in each of the 23 local authorities
which fall within the ETC (19 local authorities, 2
county councils, the London Docklands Development
Corporation (LDDC) and the London Planning
Advisory Committee (LPAC)). It investigated attitudes
towards spatial information and GIS technology, the
use and format of spatial information, data collection
methods, the degree to which each authority had made
progress towards implementing GIS technology, user
needs and user awareness of data quality requirements.
The second survey was a case study which built upon
the first by addressing more directly the issues of data
sharing and data integration within those authorities
which are currently implementing GIS technology
(Lopez). Figure 4 represents a simplification of the
planning related information flows identified as
occurring between the various tiers of government in
the ETC. From this analysis it was possible to assess
the institutional constraints to the development of a
sub-regional multi-discipline database for the corridor.

Of the authorities interviewed in the first survey
68% had carried out assessments of GIS and were
considering implementation, 23% had acquired GIS,
23% were planning pilot studies, and one council was
planning future implementation. These results are
remarkably similar to the figures obtained by Campbell
and Masser¹⁰ in their GIS survey of the 514 local
authorities in Britain (table 1).

The officers interviewed in the ETC surveys
generally agreed that the main incentives towards
adopting GIS technology were its ability to carry out
new tasks and increase productivity, pressure for
better information and the fact that the spatial nature
of so much local authority data would make GIS
efficient. The main forces acting to discourage the
adoption of GIS were seen to be a lack of money,
a lack of computing resources, the fact that
GIS implementation is often time consuming and the
poor quality and availability of data.

Figure 4. Planning Related Information Flows in Local Government within the ETC.


GIS and decision-making 

Responses obtained during the surveys indicated that
there is a high level of GIS awareness in local
government. From our analysis of attitudes towards
GIS technology, however, it was apparent that local
government personnel in the ETC are not commonly
aware of its full capacity for use and in turn, therefore,
cannot be aware of the data quality requirements that
would make their digital data ‘useful’ to other users.
Interview respondents overwhelmingly felt that policy
formulation, implementation and evaluation would
benefit more than any other activities from GIS
implementation. Yet while 60% could envisage GIS
assisting day-to-day operational decision-making (in
tasks such as local land searches, accident monitoring
or social service work load planning), only 35% could
envisage it assisting longer term strategic decision-making,
and a significant proportion of the respondents
in this instance could not give examples of such uses,
gave confused answers as to what strategic decision-making
via GIS would entail, and/or appeared not to
have considered such issues until the question was put
to them. Campbell and Masser similarly concluded
that, given the emphasis which has been placed on the
contribution of GIS to improved strategic decision-making,
it is striking that staff in local authorities
associate such systems with the more straightforward
activities of improving information processing
facilities.

The conclusion which can be drawn from these
responses is that if the agencies which act as data
collectors for digital databases are unaware of the full
potential for use of digital databases, they will similarly
be unaware of their data needs and data quality
requirements. It is apparent that a change in staff
attitudes towards digital data is required, not only to
implement GIS effectively, but to ensure the collection
and integration of appropriate data. Most of the GIS
managers interviewed (75%) recognized the need for
more information technology awareness in chief
officers, members, planning officers and technical
staff in order to develop GIS effectively (however,
only 43% felt there was a need for training to promote
awareness). Almost 80% of the respondents also felt
that it would be necessary to improve their data
collection techniques in order to improve their spatially
referenced databases. The most commonly quoted
changes envisaged as being necessary included a need
to increase the scale of analysis, a requirement for
more regular survey updates, the acquisition of data sets
and a critical need for improved data accuracy and data
capture techniques.


Data 'ownership' 

Data confidentiality and data protection also pose
significant problems for the extraction of useful data
from within institutional systems. The UK employment
statistical series exemplifies the problems arising from
legal codification of certain confidentiality guarantees
(Coombes). For example, in the case of integrating
employment data into the ETC database, although
these data are available at the ward level, data
confidentiality requires users to amalgamate the
information to a larger unit in order to ‘present’ it. The
issue was resolved in the case of the ETC database by
defining areas of unique labour market characteristics
within the corridor and amalgamating the data to these
units (figure 5). This solution, however, is not conducive
to data integration as our base unit for analysis (and
data collection unit for other such group statistical data
such as population and unemployment data) was the
ward.
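The amalgamation step, and the reason it works against integration, can be shown with a small Python sketch. The ward names, the counts and the ward-to-area lookup are invented for illustration and are not ETC data; the function name `amalgamate` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical ward-level employment counts (not real ETC figures).
ward_employment = {"ward_a": 1200, "ward_b": 800, "ward_c": 450, "ward_d": 950}

# Hypothetical assignment of each ward to a labour market area.
ward_to_area = {
    "ward_a": "area_1",
    "ward_b": "area_1",
    "ward_c": "area_2",
    "ward_d": "area_2",
}

def amalgamate(values, lookup):
    """Sum ward-level values into the larger units named in the lookup."""
    totals = defaultdict(int)
    for ward, value in values.items():
        totals[lookup[ward]] += value
    return dict(totals)

area_employment = amalgamate(ward_employment, ward_to_area)
```

Once only `area_employment` may be presented, the ward-level detail cannot be recovered from it. Other data sets held by ward, such as population or unemployment, must then either be amalgamated to the same areas, losing their detail as well, or be analysed on a different spatial unit from the employment data.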

Coincident with the problem of confidentiality is
the issue of data protection, the caution that exists
about providing data to other users. While the need for
data in machine readable formats is currently driving
interest in data exchange in the ETC, there is in fact
little transfer of digital data between or within local
government agencies. Of the 23 GIS managers (or
management groups) interviewed, 64% did not have
access to spatially referenced data held in other
departments that they would like access to, and only
50% expected other departments within their authority
to provide data to their department (and this figure is
reduced to 35% when those respondents who were
speculating about possible future exchange are
excluded).

GIS managers considered the major constraints to
digital data sharing between and within local authorities
to be:

• departmental data are not held digitally,

• information exchange had not progressed to
that point technologically,

• distrust of ‘outside’ use of departmental data or
fear of competition as a result,

• personnel were unaware of data available in
other departments,

• lack of incentives to exchange data (in fact,
local authority directives were in place inhibiting
direct access to data),

• concerns over information privacy.


Departmental data are not held digitally. Interview
respondents claimed that there is very little meaningful
information available which can be easily integrated
into their databases. Although there have been some
attempts to develop corporate-wide GIS in the ETC,
departments generally collect and store data in distinct
ways for distinct purposes which may not meet the
requirements of other users.


Distrust of ‘outside’ use of departmental data or fear
of competition as a result. Differences in departmental
data requirements of the same data set, and the potential
for misuse of information when data are used for
secondary analysis, were seen to be strong arguments
for limiting data sharing. Local government officers
were also wary that the exchange of information, both
within an authority and with outside agencies, could
leave their plans and operations open to examination,
which in turn could prejudice their interests.


Lack of incentives to exchange data. In many cases
departmental users will not share their data because
directives exist which forbid them to do so. More
often, however, isolationist departmental attitudes
prevent data exchange. It is likely that this situation
will be heightened with the spread of Compulsory
Competitive Tendering (CCT), which may further
constrain existing data sharing arrangements.
Interviewees indicated that departments were
increasingly concerned that data should only be used
to meet their own performance targets in an environment
where they were competing for increasingly scarce
resources. Emerging EC legislation (such as the
European Directive on Freedom of Access to
Information on the Environment, CEC), however,
may encourage, if not promulgate, more open data
access and data sharing policies.


Information privacy concerns. Approximately 10% of
the spatial databases which were investigated contain
private information about individuals. Although local
authorities have regulations preventing the release of
confidential information, these seem to raise more
questions than they resolve (Harrison). What is lacking
is a clear information policy defining which databases,
or portions of databases, are to remain confidential.
Without such a clearly defined policy, data custodians
will tend to be reluctant to release any data about
individuals.


Although technology provides new, alternative
and efficient means to use data, if the availability of
information products is limited because of access
problems, the potential of GIS will not be fully realized.
Traditional departmental isolationist tendencies prevail
in local government, resisting rather than addressing the
corporate information strategy efforts that would
enhance GIS implementation. Reinforcing this trend
is the fact that the current emphasis in local government
management style is on the existence of cost centres
and favours the demise of corporate planning. Most
GIS managers, however, are now becoming aware that
constraints to the flow of information will prevent
future data integration and will create serious obstacles
to GIS development. Also running counter to this
resistance to reform are the data sharing initiatives set
in place by the introduction of the Community Charge.
These initiatives have forced departments which hold
information concerning the location of individuals to
standardize their address lists and pool their data.


An information strategy 

The lack of information policies which clearly define 
relevant data standards, maintenance schedules and 
data sharing protocols within many of the local 
authorities implementing GIS in the ETC has led to 
the creation of departmental databases and ‘information 
islands’ within departments which cannot easily share 
data with one another or outside agencies. Existing 
GIS strategies in the ETC tend to focus on the 
acquisition of hardware and software, rather than on 
the institutional and data related concerns necessary 
for successful implementation and effective information 
sharing. As a result none of the interviewed local 
authority departments currently undertaking GIS 
development, in some cases after more than three 
years, has a fully operational GIS in place. The 
establishment of workable data handling policies is 
essential for effective data exchange. 


Data integration: maintaining data integrity 

Data availability 

Data availability will inevitably vary from case to case 
and from country to country. The difficulties associated 
with obtaining data from within an institutional 
framework, however, will be exacerbated when an 
area of interest crosses administrative boundaries. 
Within the lower tiers of government data are 
collected, analysed, assembled and represented by 
specific agencies each with their own statutory (or 
administrative) responsibilities. In most cases, local 
authority departments act individually and rationally 
to design databases that contribute optimally to these 
activities. In some cases this has led to a fear of 
‘corporate GIS’ or data coordination efforts, because it 
is believed that these initiatives may diminish a 
department’s ability to fulfil its role. The result is the 
development of different maps and databases within 
authorities, and from authority to authority, which 
relate to the same or compatible data, with distinct 
features, scales, attributes and accuracies. This often 
(and almost inevitably) leads to data quality 
inconsistency, data incompatibility and, in the worst 
case, incomplete data coverage. Thus any attempted 
aggregation of county level data, for example, will 
encounter (among other things) data with non-uniform 
attribute detail and varying data collection time frames. 

In the ETC population data by ward for intra- 
Census years is collected and/or forecast by different 
organizations in different areas of the corridor (the 
county councils make forecasts for the districts and 
LPAC produces forecasts for the London boroughs). 
The result is that the different organizations made 
population forecasts for different years using different 
forecasting techniques. Since 1981, therefore, there 
are only three years for which there are population 
forecasts for the whole corridor, and once obtained 
these data were all in different formats. The preliminary 
processing then required to strip out the necessary data 
and convert them into an appropriate format for 
inclusion into a digital database was extensive. 
Furthermore, because ward names were spelled 
differently from source to source, all the ward names 
had to be checked and standardized before population 
data could be attached to the wards in the ETC GIS. 
Then, because some ward names are duplicated within 
the corridor, they had to be checked a third time to 
ensure data allocation to the correct wards. Thus while 
GIS provides the geographer with the flexibility to 
move across traditional boundaries and view diverse 
areas simultaneously, there remain artificial borders in 
data. 
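
The two checks described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the ETC database: the spelling rules, the function names and the district and ward names in the usage note are all hypothetical.

```python
import re
from collections import defaultdict


def normalize_ward_name(name: str) -> str:
    """Reduce a ward name to a comparison key (hypothetical rules)."""
    key = name.lower().replace("&", " and ")
    key = re.sub(r"['’]", "", key)                  # "Mary's" -> "marys"
    key = re.sub(r"\bst\.?(?=\s)", "saint", key)    # "St." / "St" -> "saint"
    key = re.sub(r"[^a-z0-9 ]", " ", key)           # drop remaining punctuation
    return " ".join(key.split())                    # collapse whitespace


def find_ambiguous_wards(wards):
    """wards: iterable of (district, ward_name) pairs.

    Returns the normalized names that occur in more than one district,
    i.e. the names that cannot be used alone to allocate data to a ward.
    """
    districts_by_key = defaultdict(set)
    for district, ward in wards:
        districts_by_key[normalize_ward_name(ward)].add(district)
    return {k: sorted(d) for k, d in districts_by_key.items() if len(d) > 1}
```

Under these assumed rules "St. Mary's" and "ST MARYS" reduce to the same key, while a name such as "Castle" that appears in two districts is reported as ambiguous and has to be qualified by its district before data are attached.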

In this context it is also worth noting that due to the 
inadequacies of nationally available statistical series 
and the lack of accepted design criteria for planning- 
orientated information systems (Worrall), local 
authorities are beginning to develop their own unique 
solutions to the problem of data availability, periodicity 
and quality, which acts to further preclude the likelihood 
of data integration between organizations. 


Data conversion 
The next aspect of creating an integrated database is 
then data conversion. This embraces a broad spectrum 
of problems concerned with the transfer of geographic 
data from one system to another. The technical 
problems, however, have largely been overcome, as 
the proliferation of GIS and Database Management 
Systems (DBMS) from a variety of vendors has 
increased the demand for facilities to transfer data 
swiftly and reliably between systems. Unfortunately 
while the development of these facilities has on the 
one hand done much to force the issue of data standards 
and formats, it can on the other lead to a neglect of the 
issue of data quality due to the ease with which data 
translations can be implemented. A major problem 
with GIS is that it is becoming too easy to compare 
incompatible data, to produce apparently correct, but 
technically inconsistent results (Charlton and Ellis). 
The technological improvements which have 
greatly enhanced the means of data transfer between 
information systems are only relevant, however, when 
data are in a digital form. In the ETC much of the local 
government data relevant to social and economic 
planning remain in hard copy form. In 1991, 44% of the 
planning departments interviewed still exclusively 
transferred their data on paper. The technological 
improvements are also only relevant when the data 
recipients have the skills and computer resources to 
transfer data between digital systems. Of the officers 
interviewed during our surveys, 60% noted that data 
transfer was hindered in their organizations by a lack 
of skill and a lack of necessary hardware and software 
facilities. 


Data integration 

True data integration is elusive because different users 
create compatible data sets differently. Objects are 
locationally referenced in different ways, to different 
standards of accuracy and uncertainty, at a number of 
different scales in different time frames. This leads to 
two contrasting types of problem: technical difficulties 
concerned with the actual integration of data sets arise 
and, more significantly, questions of data quality arise 
due to the differential accuracy and uncertainty 
associated with the various input data sets. The latter 
problem is of particular importance in projects where 
data are obtained from non-commercial sources, and 
its significance increases exponentially with the number 


April 1993, Aslib Proceedings 


of different organizations from which data are obtained. 
Furthermore, the issue of data integration and exchange 
in a cross-agency GIS project, such as the ETC project, 
is distinctly more complicated, time-consuming and 
unpredictable than in a single purpose local area study. 

The technical difficulties common to integrating 
data sets can, in most cases, be overcome; the problem 
then becomes the time and resources required to 
overcome them. One of the most significant technical 
limitations to integrating data sets results from 
differences in spatial referencing systems. Openshaw 
points out that this discrepancy often occurs because 
many GIS users start building their systems taking for 
granted that the spatial referencing system they use is 
the standard one, confident that there is only one that 
can be used, and believing that in any case a little 
transformation here or there would allow easy 
conversion if a problem arises or change occurs. The 
key to overcoming differences in spatial referencing 
systems between data sets to be combined is finding a 
feature in the data sets common to both (such as a name 
or code), or if none exists finding a third (or fourth) 
data set, which in turn contains an attribute present in 
each of the original data sets, through which to link 
them. Integrating into the ETC database population 
data from the preliminary results of the 1991 census, 
for example, entailed linking three data files. The data 
provided to us were spatially referenced via a ‘district 
code’ item which was different to the district codes 
stored in our database. We then had to use an 
intermediate file which contained our data set’s ‘district 
code’ and the population data set’s ‘district name’ to 
link the population data to our district maps. The 
process thereby becomes more and more complicated 
and time consuming, and data quality becomes more 
and more questionable as the link between the first and 
last data set becomes less tenable and additional data 
sets introduce further potential inaccuracies. 
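
A minimal sketch of such a three-file link is given below, assuming that the population file can be keyed by district name and that the intermediate file pairs our district code with that name. The function name, the data structures and the codes in the test data are hypothetical; the point of the sketch is that every record which fails to link is reported rather than silently dropped.

```python
def link_via_lookup(population_by_name, lookup_rows, district_maps):
    """Attach population figures to map districts through an intermediate file.

    population_by_name: {district_name: population} from the supplied data
    lookup_rows:        iterable of (our_district_code, district_name) pairs
    district_maps:      {our_district_code: dict of map attributes}

    Returns (linked, unmatched): the districts that received a population
    figure, and the codes for which the chain of links was broken.
    """
    name_by_code = {code: name for code, name in lookup_rows}
    linked, unmatched = {}, []
    for code, feature in district_maps.items():
        name = name_by_code.get(code)
        if name is None or name not in population_by_name:
            unmatched.append(code)   # broken link: needs manual checking
            continue
        linked[code] = {**feature, "population": population_by_name[name]}
    return linked, unmatched
```

Each additional intermediate file adds another place where the chain can break, which is why the list of unmatched codes is returned alongside the linked records.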

A second significant technical limitation to data 
integration arises due to inconsistencies in attribute 
structure, such as non-standardized spelling or coding 
of addresses, abbreviations and numbers. The absence 
of effective and recognized data standards, or GIS 
implementation guidelines, remains a serious issue for 
data sharing and integration. It has the effect of ensuring 
that information created within one department or 
ETC local authority is rarely compatible with another. 
Problems arising from the lack of standards include 
poorly defined accuracy objectives and attribute coding 
(decisions are often left to the discretion of those 
entering the data) and a lack of consistency between 
projects. In the case of social and economic data sets, 
for example, the data are not location specific and 
therefore rely on accurate and consistent attribute 
detail (such as administrative area names or codes) to 
be ‘located’. If an attribute such as an address is the 
only means of locating and linking information derived 
from different sources, it then becomes imperative that 
the naming and/or addressing systems used are 
compatible, but this is not usually the case. As part of 
the ETC project data concerning landfill sites were 
derived from two different sources, with the ‘site 
name’ representing the only link between the two data 
sets. Table 2 illustrates the extent to which the attribute 
structure was standardized both within each data set 
and between the data sets. 
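
The difficulty can be made concrete with a small sketch that scores pairs of site names from Table 2 by the proportion of words they share. This is an illustration only, not the matching method used in the ETC project; the abbreviation list and the function names are assumptions.

```python
import re

ABBREVIATIONS = {"rd": "road"}   # hypothetical expansion list


def name_tokens(name: str) -> set:
    """Lower-case a site name and split it into a set of words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return {ABBREVIATIONS.get(w, w) for w in words}


def token_overlap(a: str, b: str) -> float:
    """Shared words divided by all distinct words in the two names."""
    ta, tb = name_tokens(a), name_tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def best_match(name: str, candidates):
    """Return the (candidate, score) pair with the highest overlap."""
    return max(((c, token_overlap(name, c)) for c in candidates),
               key=lambda pair: pair[1])


SOURCE_ONE = [
    "Moor Hall LFS Romford Rd", "Baldwins Farm",
    "Refuse Tip Hollow Rd Widdington", "Bowers Marshes Basildon Essex",
    "Squerries Pit", "Stockley Rd Tip London SE11", "Braintree Road",
]
```

On this measure "Hollow Rd" is matched to "Refuse Tip Hollow Rd Widdington", but "Bowers Marshes Basildon Essex" and "Marsh road" share no words at all, so a link of that kind can only be confirmed by someone who knows the sites.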

A third problem area common to integrating data 
sets is scale. For any given study area data are rarely 
collected and mapped at compatible map scales. 
Multiple scales of data can be equally accurate, but 
inconsistent when integrated to a common scale. The 
outcome of this may be sliver errors and resolution 
problems. It is also apparent that different results may 
be obtained from statistical analysis on the same set of 
data grouped at different scales (Benedetti). 

The questions of data quality which may arise as a 
result of data integration are not as easily identifiable 
as the technical problems. For example, since the 
reliability of map and attribute data is often not 
quantified within the local government setting (with 
the marked exception of data concerning the authority’s 
local financial base, such as rent and local tax 
information, and the data which may be used for 
checking the block grant allocation), it becomes very 
difficult to assess how meaningful such data are for 
secondary use. The lack of standardized quality control 
measures, which can determine the reliability of 
information, often prevents data custodians from 
confidently releasing their data and users from 
applying them. 

In many cases, however, data quality must be 
accepted so as not to preclude analysis altogether. 
Many agencies have monopolies on information 
because alternative sources of data are rare or expensive. 
In the case of the ETC, for example, a number of 
significant data sets are only available from one source, 
and where the analysis to be carried out is significant 
their data quality must, within reason, be accepted. 
What cannot be accepted is the use of a data set without 
being aware of the accuracy, currency and uncertainty 
associated with it. Insufficient information describing 
the quality of a data set renders many available data 
sets potentially unpredictable and ultimately unusable. 
It is, therefore, critically important that data specific 
error (for example positional and representational error 
in spatial references) and uncertainty details should be 
stored with each data set (Openshaw). According to 
the US National Committee for Digital Cartographic 
Data, a fully described data set will contain details of 
the following elements: 

• Lineage (data origin), 
• Positional Accuracy, 
• Attribute Accuracy, 
• Logical Consistency, 
• Completeness. 


Possibly the most significant lesson to be learnt 
from compiling such a database 'puzzle', is that very 
different answers can be derived from similar analyses 
based on the same data, and that very wrong answers 
can be derived using perfectly logical GIS analysis 
techniques, if the user is not aware of the particular 
peculiarities of their data. 


Conclusion 
If urban planning and land management are to be 
effective they must be built on a firm foundation of 
well defined data systems. Data systems ultimately, 
however, are dependent on data availability, ease of 
integration and inherent quality. GIS provides us with 
the potential to enhance strategic planning by the 
integration of public data sets, and to create a profile for an 
area lacking definition, a role essential to successful 
planning and land-management. GIS should focus our 
attention on the underlying quality and utility of data. 
It is apparent from this analysis that institutional 
and organizational data issues have a profound 
influence on the extent to which opportunities offered 
by GIS will be realized in practice, and ultimately can 
determine the success or failure of a GIS project. Only 
by carefully examining the role that spatial information 
plays within the larger institutional context will the 
dynamics of data sharing and integration be fully 
understood. Such an understanding can potentially lead to 
the development of rational data sharing policies and 
standards. 


Table 1. Comparison of GIS Uptake Nationally and in the ETC, 1991. 

                       Local authorities     Local authorities   Local authorities 
                       considering the       which have          with firm GIS 
                       introduction of GIS   acquired GIS        purchase plans 

National Survey        69.3%                 8.6%                16.5% 
(Masser & Campbell) 

ETC Survey             68%                   5%                  23% 
(SERRL) 


Table 2. Alternative Landfill Site Names 

Source One:                          Source Two: 

Moor Hall LFS Romford Rd             Moor Hall and Sandy Lane 
Baldwins Farm                        Baldwins Farm Barling Magna Essex 
Refuse Tip Hollow Rd Widdington      Hollow Rd 
Bowers Marshes Basildon Essex        Marsh road 
Squerries Pit                        Squerryers Sand Pit Westerham Kent 
Stockley Rd Tip London SE11          Stockley Park 
Braintree Road                       Shalford Pits Braintree Essex 


Using language analysis to manage information 


Godfrey Smart 
Peingown, Kilmuir, Isle of Skye, IV51 9UB 


The ESPRIT Project SIMPR developed software to analyse documents and generate indexes for them. Of immediate 
application as a document indexing and classification system, this also offers a technology for information modelling 
that has broader implications, supporting many new uses for information management software. The Project was 
based on the assumption that information can only be managed successfully by computer systems that can view the 
information contained in a document through the language in which the document is written, and that systems need 
to be sufficiently flexible to respond to the changing requirements of document use. 


The starting point for the SIMPR Project was the 
hypothesis that increasing automation of information 
processing has now caught up with the ‘paper-book’ 
model of information. 

A book is linear. Most printed books (i.e., works of 
fiction) only make sense if you ‘begin at the beginning, 
and go on until you come to the end: then stop’ (King 
of Hearts). Non-fiction textbooks assume the reader 
starts in a state of ignorance, and progress down a 
continuous stream of knowledge until all relevant 
information has been imparted, in an orderly fashion. 

This information model is derived from the structure 
of printed information. Starting at the top left of page 
first, ending at the bottom right of page last, words 
flow in a long line that is straight (although folded 
many times). Illustrations, graphs, and tables only 
interrupt the linear flow, causing a nuisance to 
typesetters and designers. 

The linear model is so ingrained in our habits that 
we structure our thoughts to correspond to it, and 
impose it on the design of our computerized information 
systems. Bibliographies, directories, even full-text 
databases, are linear sequences of information, neatly 
arranged in alphabetical or chronological sequence. If 
we wish to navigate them in a different order, we are on 
our own, struggling to use and control Boolean logical 
operators and search and retrieval interfaces of varying 
complexity and unfriendliness. The fact that these 
systems are difficult to use successfully has been well 
documented (see, for example, Blair and Maron). 
Typically, extensive search will result in recall of 20% 
of the required information. 

It is not surprising that information is mapped from 
books onto computer information systems in this linear 
way, since most computer information systems are 
still merely electronic copies of paper information 
systems: dumps of typesetting tapes or word-processor 
files used to author printed documents. Few people 
have yet confronted the task of authoring information 
directly onto an electronic medium. Even 
hyperdocuments tend to be planned in a linear 
sequence, with jumps and tributaries mapped onto 
the underlying regularity, and concern spent on 
preventing the reader becoming lost in the resultant 
complexity. Why? Does being lost in hyperspace lead 
to our being sucked into the computer through the 
screen, to float in the ether crying for help like an extra 
from ‘Poltergeist’? 

I propose instead the ‘bran tub’ model of 
information: stick both hands in, rummage around to 
see if you can find anything interesting, and if not, 
stick your hands back in and try again. Let the software 
worry about where you are, how you got there, whether 
you can get back, how many more things are left in the 
tub for you to find. 

Although the Project partners might not recognize 
it, this was the underlying stratagem for the SIMPR 
Project (ESPRIT 2083). A 3.5 year project that 
concluded its work in June 1992, SIMPR spent over 60 
man years and £5 million of effort in investigating 
ways we can use natural language analysis to model 
and manage information. 

SIMPR built a document analysis and indexing 
system that includes software developed by the 
Research Unit for Computational Linguistics (RUCL) 
at Helsinki University, Finland, for morphological and 
syntactic analysis, and software developed by the 
Department of Information Science at the University 
of Strathclyde, Scotland, and by Computer Resources 
International (CRI), Denmark, for automatic extraction 
of index terms from the results of linguistic analysis of 
texts. These software components were integrated by 
Cap Gemini Innovation in The Netherlands. 

Other partners in the project were The TNO Institute 
from The Netherlands (who researched user 
requirements for information management systems), 
the Nokia Research Centre of Helsinki, Finland (who 
developed techniques for modelling document 
domains), University College Dublin (UCD) (who 
experimented with the use of machine learning for 
automatic document classification), Dublin City 
University (DCU) (who developed software to analyse 
document structures) and the Catholic University of 
Portugal (who researched potential applications for 
using the system in Computer-Aided Instruction). CRI 
in Denmark coordinated the project. 


Automatic indexing of information: Knowing 
‘where it is’ 

SIMPR indexes a document by identifying those words 
within the document that are significantly 
meaning-bearing; emphasis is thus placed on the quality of 
index entries rather than their quantity. SIMPR uses 
techniques developed from linguistic analysis that is 
surface-near: that is, it is not constrained by the deep 
semantics of the domain, but uses morphological and 
syntactic information only. SIMPR incorporates three 
knowledge sources: 

• Linguistic knowledge of the kinds of 
  morphosyntactic word sequences that might be high 
  meaning-bearing. 

• Expert indexing knowledge of how these 
  sequences should be restructured to maintain 
  consistent presentation to the user. 

• Knowledge of the purpose for which the index is 
  intended. 

SIMPR processes documents that are to be stored 
in an electronic information base. It prepares each 
document for storage, indexes it, and maintains a 
model of the information base into which the user 
locates the document. When the user then issues a 
request for information, this request is used to search 
the indexes of the base to locate one or more suitable 
documents. From these locations the user can browse 
the information base structure to see if other documents 
are also relevant. Storage and search are controlled by 
lexical, morphological, and syntactic processing 
modules, not just by character matching. 

SIMPR manipulates a document as texts. Each text 
is defined as a heading and the paragraphs that follow 
it, up to, but not including, the next heading (whatever 
its level). SIMPR’s indexing software (MIDAS) 
processes each text for storage by indexing it: it 
extracts words and phrases with a high meaning content 
(analytics), and uses them to construct a disambiguated 
list of index terms. 
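
The definition of a text given above can be sketched as follows. This is an illustration, not SIMPR's own code: it assumes that the caller supplies a test for recognizing headings, that each non-blank line is one paragraph, and that material before the first heading is ignored.

```python
def split_into_texts(lines, is_heading):
    """Group lines into texts: a heading plus the paragraphs that follow it,
    up to, but not including, the next heading of any level.

    lines:      iterable of strings, one paragraph per non-blank line
    is_heading: hypothetical caller-supplied test, line -> bool
    """
    texts, current = [], None
    for line in lines:
        if is_heading(line):
            if current is not None:
                texts.append(current)        # close the previous text
            current = {"heading": line.strip(), "paragraphs": []}
        elif line.strip() and current is not None:
            current["paragraphs"].append(line.strip())
    if current is not None:
        texts.append(current)                # close the final text
    return texts
```

Each dictionary returned is then a unit that can be indexed and retrieved separately.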

Each text is processed in three stages: 

1. Pre-processing and structural analysis 
2. Morphosyntactic language analysis 
3. Indexing 

Pre-processing removes non-indexable material 
from the text (mathematics, artwork, program code, 
and so on) and analyses the structure of the document 
to identify the boundaries between texts (the headings) 
and between sentences. SIMPR processes documents 
that conform to ASCII standard, but can analyse SGML 
markup to recognize headings and specialize processing 
of elements such as captions, footnotes, or tables. 

Each text is then processed to identify the 
morphology and syntax of each word in each sentence 
of the text. This language analysis is performed by a 
new technique based on the use of constraints. The 
constraint approach starts by listing all possible lexical 
or syntactic interpretations of a word, and then using 
available information to constrain these interpretations, 
by removing those that are invalid in the context of the 
word being analysed. The goal, of course, is to remove 
all interpretations except one, the correct one. This 
approach is implemented using a constraint grammar 
(Koskenniemi). 
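
The idea can be shown with a toy sketch: every word starts with all the readings the lexicon allows, and constraints discard readings that the context rules out, never removing the last one. The three-word lexicon, the tag names and the single rule are hypothetical, and the sketch does not reproduce the constraint grammar formalism itself.

```python
def disambiguate(words, lexicon, constraints):
    """Return one set of surviving readings per word.

    lexicon:     {word: set of possible tags} (hypothetical)
    constraints: functions (position, readings) -> set of tags to discard
    """
    readings = [set(lexicon.get(w.lower(), {"UNKNOWN"})) for w in words]
    changed = True
    while changed:                        # repeat until no rule applies
        changed = False
        for i in range(len(words)):
            for constraint in constraints:
                if len(readings[i]) == 1:
                    break                 # already unambiguous
                discard = constraint(i, readings) & readings[i]
                if discard and discard != readings[i]:   # keep at least one
                    readings[i] -= discard
                    changed = True
    return readings


def no_verb_after_determiner(i, readings):
    """Toy rule: a word directly after a certain determiner is not a verb."""
    if i > 0 and readings[i - 1] == {"DET"}:
        return {"VERB"}
    return set()


LEXICON = {"the": {"DET"}, "index": {"NOUN", "VERB"}, "terms": {"NOUN"}}
```

With this lexicon "the index" resolves "index" to a noun, whereas "index" on its own keeps both readings, so the ambiguity is passed on rather than guessed away.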

The advantages of the constraint approach include: 

• It always produces an interpretation of each 
  word, although some words may be considered 
  ambiguous, with more than one interpretation 
  remaining. 

• Real syntactic and semantic ambiguity is 
  preserved and passed on to the next software 
  module. 

• Basic system functionality is language 
  independent, and implementation of the software 
  can be made for any language by defining suitable 
  rules of analysis and supplying a lexicon. The 
  constraint-based language analysis system has 
  been tested on most European languages and 
  several non-European languages. 

• No semantic knowledge is needed by the system; 
  language analysis is based solely on lexical and 
  syntactic principles. Thus any text in the language 
  can be processed and the system is domain 
  independent. 

Output from language analysis is passed to the 
indexing software, which identifies and extracts text 
sequences likely to represent the meaning of the text: 
candidate analytics. The indexing software (MIDAS) 
incorporates expert indexing knowledge, to identify 
word sequences that are likely to be high meaning- 
bearing phrases (PARWOS) and to generate analytics 
from them. MIDAS thus uses the morphosyntactic 
analysis of texts to extract all useable index terms, with 
the minimum number of unwanted terms. 

Some further processing of the extracted analytic- 
rich word sequences is needed before they can be used 
as index terms. The main stages in this processing are: 

• resolution of complex word sequences involving 
  structures such as conjunction and disjunction; 

• normalization and transformation, to ensure that 
  candidate analytics conform to standardized 
  structures and word forms, using a small number 
  of rules, and to remove stop-words from 
  candidate analytics; 

• consolidation, to bring together duplicate and 
  subsumed analytics, retaining the most specific 
  form; 

• filtering, to remove unwanted analytics from the 
  candidate set. 
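
Three of these stages can be sketched in simplified form. The sketch is not the MIDAS implementation: normalization here only lower-cases and removes a hypothetical stop-word list, the resolution of conjunctions is omitted, and all names are assumptions.

```python
STOP_WORDS = {"the", "a", "an", "of", "and", "or"}   # hypothetical list


def normalize(phrase: str) -> tuple:
    """Lower-case a candidate analytic and drop stop-words."""
    return tuple(w for w in phrase.lower().split() if w not in STOP_WORDS)


def is_subsumed(short: tuple, long: tuple) -> bool:
    """True if `short` occurs as a contiguous run inside a longer analytic."""
    n = len(short)
    return n < len(long) and any(
        long[i:i + n] == short for i in range(len(long) - n + 1))


def consolidate(candidates):
    """Merge duplicates and keep only the most specific form."""
    unique = [u for u in dict.fromkeys(normalize(c) for c in candidates) if u]
    return [" ".join(u) for u in unique
            if not any(is_subsumed(u, other) for other in unique)]


def filter_analytics(analytics, unwanted):
    """Remove analytics that the index owner has declared unwanted."""
    return [a for a in analytics if a not in unwanted]
```

Given the candidates "Air pollution", "air pollution" and "control of air pollution", consolidation keeps the single, most specific entry "control air pollution".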

SIMPR’s language analysis and indexing components 
are operational on a UNIX workstation (SUN 
System 3 or SparcStation). The main SIMPR software 
system processes a typical input document at a speed 
of approximately 1,000 words per minute (for all stages 
of preprocessing, linguistic analysis, and indexing). 


Information retrieval: ‘Finding it’ 
Texts are retrieved following a search of the index, or 
by browsing graphical models of the structure of the 
information base. SIMPR’s retrieval software running 
on the SUN workstation, PROSPECT, uses graphical 
displays to support searches by users who are not 
expert in Boolean logic, or to support information 
browsing. PROSPECT and MIDAS can be decoupled, 
for example to use MIDAS to prepare validated SIMPR 
indexes which can then be searched using a 
conventional text retrieval system. 

PROSPECT uses the same language analysis and 
indexing software as is used to index documents for 
the information base. Thus it can accept full natural 
language search queries, and can search using the 
morphological base form of query words, finding 
matches of verbs against nouns, participles against 
adjectives, and so on. 

The Project also investigated the problem of 
classification. Most commercial interest in new- 
generation IR systems focuses on automatic 
classification against a user-supplied thesaurus rather 
than automatic free-term indexing. Information 
consumers, so the argument goes, are used to working 
with a thesaurus, and the need is to continue to support 
this and match documents against thesaurus terms 
automatically. 

SIMPR cannot do this, nor can any other currently 
available generic software system. Current technology 
can deliver automatic classification systems only by 
working in a closed and restricted domain, where 
terminology and concepts are essentially non-volatile, 
and can be captured and mapped into a knowledge 
base. This knowledge base is built by hand constructing 
a semantic network of subject headings as they apply 
to the domain, and for each generating a rule set to test 
an input document to see if it can be classified against 
that node on the network. Deciding the correct level of 
detail (choosing broader or narrower terms) remains 
an intractable problem. 

One approach, which SIMPR investigated, was to 
use automatic learning algorithms to analyse corpora 
of classified documents, to see if the system could 
learn links between index terms extracted from a 
document by SIMPR and the thesaurus codes attributed 
to the document by a human classifier. This work was 
performed by University College Dublin, and is 
partially documented in earlier papers on the Project 
(Gibb and Smart). It demonstrated the feasibility of 
the approach, but funds were not available to 
complete the work and develop functioning software. 

A similar approach would be to index a large 
corpus of classified texts, and use statistical analysis to 
detect patterns of word occurrence that could be used 
to determine level of detail, significant mention, and 
hence connections to a thesaurus code. After all, a 
thesaurus is a latent network model of a domain, 
linking its main concepts in a hierarchical structure 
and cross-connecting these in a hyperspace-like way. 

The main reason the SIMPR Project diverted funds 
from classification research to the core problem of 
automatic indexing was the growing opinion that 
effective indexing made classification unnecessary; if 
a searcher can quickly and interactively investigate 
every meaningful term contained in a document, and 
explore connections between different subjects that 
are often discussed together, if he can dip his hands in 
the bran tub as often as he likes, why should he need to 
follow a pre-digested route to the information based on 
somebody else’s opinion on what a document is 
‘about’? 

This perspective on search led to the completion of 
SIMPR’s automatic indexing software, and to the 
quick and simple PC-based retrieval systems described 
below. These are designed to solve some of the problems 
encountered when using conventional information 
retrieval software systems. 


The problems of conventional search systems: 
‘knowing what you’ve found’ 

Current document management systems are primarily 
still based on technologies developed in the 1960s: file 
inversion for indexing documents, and Boolean logic 
(using AND, OR, and NOT operators) for retrieval. A 
user requests information by specifying a search query 
containing one or more search terms. The characters in 
these terms are used to search files for matching sets of 
characters. The main limitations of this technique are: 


• Relevant documents may not contain the exact 
  term(s) specified by the user; the documents 
  may contain a synonym or a more or less detailed 
  term. 

• Irrelevant documents may contain the specified 
  term, but it may be used in a figurative way or in 
  a different sense. 

• The specified term may be ambiguous or too 
  general, resulting in recall of irrelevant 
  documents. 

• Combinations of terms may express logical 
  relationships rather than semantic ones. 
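
A minimal sketch of file inversion and Boolean retrieval makes the first two limitations easy to reproduce. It is a generic illustration, not the code of any particular retrieval system; the function names and the three sample documents are invented for the purpose.

```python
import re
from collections import defaultdict


def build_inverted_file(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def search_and(index, *terms):
    """Documents containing every term (Boolean AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Documents containing at least one term (Boolean OR)."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


DOCS = {
    1: "Air pollution in cities",
    2: "Atmospheric contamination from traffic",
    3: "Clearing the air after a pollution of the debate",
}
```

Searching these documents for air AND pollution returns documents 1 and 3: document 2 is relevant but uses synonyms, while document 3 contains both words in a figurative sense.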


Various attempts have been made to develop 
techniques to overcome, or at least compensate for, 
these limitations. Perhaps the most interesting is vector 
space modelling, which lets the system answer the user 
when he states ‘this document is interesting, find 
me more like it’. But this still begs the questions 
‘how interesting?’ ‘how many more?’, and ‘how much 
like it?’. 
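
In its simplest form, vector space modelling represents each document as a vector of term counts and measures likeness as the cosine of the angle between two vectors. The sketch below uses raw counts and a caller-supplied threshold, both of which are assumptions; operational systems normally weight the terms as well.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Represent a text as a vector of raw term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def more_like_this(seed: str, documents, threshold: float = 0.0):
    """Rank documents by similarity to the seed, dropping weak matches."""
    seed_vec = term_vector(seed)
    scored = [(cosine(seed_vec, term_vector(text)), doc_id)
              for doc_id, text in documents.items()]
    return sorted([(s, d) for s, d in scored if s > threshold], reverse=True)
```

The threshold is exactly the unanswered question: the system can rank documents by likeness, but it is still the user who has to decide how much like it is enough.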

Commercial text retrieval and online systems 
include tools to increase the complexity of a search, 
but not to discriminate or focus its results. Research 
conducted during the SIMPR Project indicated that 
most users, even professional search intermediaries, 
do not make use of complex features of information 
retrieval systems. They conduct a simple search, 
based on a simple search query, and then try to 
broaden the search if the resultant set is too small, or 
narrow it if the set is too large. Acceptable size of set 
seems to be mainly arbitrary: less than 10 is too small, 
more than 150 is too large. Anything in between is 
usually offered to the information requester as the 
results of the search. 
For example, consider the search shown in Figure 1, 
done on Dialog in Pollution Abstracts. 


The user first searches in a simple, intuitive way, 
using the phrase that described his interest. The 
number of hits (5459) is far too many for 
downloading. The user now has no clear strategy to 
use to find useful hits from the set. 


? begin 41

File 41: POLLUTION ABSTRACTS 70-92/JUL
(C. CAMBRIDGE SCIENTIFIC ABSTRACTS)

Set    Items    Description

? s air pollution

S1     5459     AIR POLLUTION


The user now attempts to modify the result by using 
logical operators. He requests documents that contain 
the word air OR pollution: 


Set    Items    Description

? s air or pollution

       27065    AIR
       35368    POLLUTION
S2     48589    AIR OR POLLUTION


Since this makes the problem worse, he next tries 
asking for documents that contain both terms 
together, using the ‘W’ operator, which ‘requires the
terms on either side of the operator to be adjacent
and in the specified order’:


Set    Items    Description

? s air (w) pollution

       27065    AIR
       35368    POLLUTION
S3     12226    AIR (W) POLLUTION


The user still has no information to discriminate 
useful from irrelevant hits. (Is it clear why S3 is over 
twice the size of S1?) 





Figure 1. 
Conventional search for information on air pollution 


In a real search process, a typical user would try to
reduce the set of 5459 hits down to fewer than 150, by
such attempts as:

• Looking only at articles published in the last two
years
• Looking only at articles published in English
• ‘And’ing the retrieved set with another term that
might focus it onto the needs of the information
requester (as interpreted by the searcher); for
example, (air pollution) AND (control), or (air
pollution) AND (reduc?).

None of these attempts will lead to predictable
results, and any of them is likely to reduce the recall
achieved in the search, without any guarantee of
increasing precision. All 5459 might be relevant to ‘air
pollution’. The probability is that the searcher’s interest
is narrower than this, but ill-defined, and cannot be
stated more precisely without some awareness of what
is in the database.


Search refinement using natural language 

The results of searching a SIMPR index are shown in 
Figure 2. This shows a search through an index built 
using SIMPR, of approximately one megabyte of texts 
on environmental law (a 500 page textbook). The 
results of the search are presented by SIMPR’s retrieval 
interface as an ordered list, with the best match hits 
presented first (those that contain both words in the 
user’s query) followed by poorer matches (those 
containing only one word). The user can browse the 
list, can choose which hits are of interest and are to be 
downloaded or printed. 

For purposes of illustration, all hits are shown in 
this example. The number of hits is smaller than the 
Dialog search because the database searched was 
smaller. The index phrases extracted by SIMPR, shown
in the example, have been extracted fully automatically,
and have not been validated. Thus Term 61
contains noise (‘as regards’) but is still meaningful,
indicating a document that says something about
water pollution.

But notice that relevant hits have been found by 
partial match that would have been excluded by 
conventional tactics to narrow search: Term 124, for 
example, or Term 85 or 94 or 117 or 133. Examination 
of headings and texts will reveal other terms that might 
drive useful searches: ‘noxious gases’, for instance. 
And some of the general references to ‘pollution 
control’ might well be worth investigating. 

In a working environment, the index terms could 
be validated to reduce noise, or the retrieval system 
could show partial match hits only if the number of 
complete matches fell below a threshold level; in this 
case that would present the user with the first 34 hits, 
indexing 56 texts; the remaining 99 hits could be 
viewed on request. 
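The ordering and threshold behaviour described above can be sketched as follows. This is a minimal illustration, not SIMPR's retrieval interface: the function names, the whole-word matching, and the default threshold are assumptions.

```python
def rank_index_terms(query, index_terms):
    """Split index phrases into complete and partial matches.

    Complete matches contain every query word; partial matches contain
    at least one. Each list keeps the original index order.
    """
    words = query.lower().split()
    complete, partial = [], []
    if not words:
        return complete, partial
    for phrase in index_terms:
        tokens = set(phrase.lower().replace(":", " ").split())
        matched = sum(1 for w in words if w in tokens)
        if matched == len(words):
            complete.append(phrase)
        elif matched > 0:
            partial.append(phrase)
    return complete, partial


def hits_to_show(query, index_terms, threshold=10):
    """Show partial matches only when complete matches are scarce."""
    complete, partial = rank_index_terms(query, index_terms)
    if len(complete) >= threshold:
        return complete
    return complete + partial


# A few phrases taken from the example index, plus one non-match.
index = [
    "industrial air pollution",
    "air pollution : control",
    "noise pollution",
    "environment : air",
    "noxious gases",
]
print(hits_to_show("air pollution", index, threshold=2))
print(hits_to_show("air pollution", index, threshold=3))
```

With the threshold at 2, only the two complete matches are listed; raising it to 3 brings the partial matches into view as well, which mirrors the ‘viewed on request’ behaviour.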

But however large the number of hits, the user still 
has a useful mechanism to refine search, because the 
retrieved set consists of meaningful phrases that can be 
browsed. And as the example shows, interesting 
information can still appear very low on the list of 
matched hits, however they are ranked. By browsing 
the list of SIMPR index terms, each phrase that looks 
interesting can be marked for retrieval, and the 
remainder discarded. Retrieval can itself be performed 
in two stages: first to view the heading of the document 
section indexed using the selected term, then to retrieve 
the text itself, if the heading looks interesting. 

This system, and the software that supports it, 
allows the user to indulge the bran tub model of 
information retrieval.

[Index phrases retrieved, best matches first; the term numbers and the word and text counts are not recoverable.]

air pollution
industrial air pollution
principal legislation controlling air pollution
combating air pollution
excessive air pollution
air pollution control : implications
air pollution control point-of-view
air pollution control industries
air pollution control equipment
air pollution control : start
air pollution : effective prevention
air pollution control
air pollution appeal board
air pollution control legislation : enforcement
air pollution : source
air pollution control areas
air pollution : control
air pollution emanating : certain forms
air pollution : cases
air pollution control : practicable means
air pollution levels
air pollution resulting
air pollution : effective control
air pollution control improved methods
air pollution control costs
controlling air pollution : costs
available air pollution control
chief air pollution control officer
South — africa : powerful person enforcing air pollution control
chief air pollution control officer : role
urban areas : serious air pollution problems
mines : air pollution control
high air pollution potential
industrial air pollution control
pollution
injurious pollution
environment : air
solid waste pollution
Sea-shore : pollution
chemical water pollution
causing water pollution
lake areas : pollution
sea : pollution
resultant pollution
unlawful pollution
harbours : pollution
produce pollution
public : pollution
biological pollution
oil pollution
marine pollution
vehicular pollution
cases involving pollution
water — pollution
harbour waters : pollution
criminal pollution
environmental pollution
State land : pollution
geographical areas : pollution
reducing pollution
as regards water pollution
industrial water pollution
freshwater pollution
water : pollution
noise pollution
sea — pollution
water area : pollution
fishing harbours : pollution
water pollution
industrial pollution
marine oil pollution
general pollution
preventing oil pollution
pollution : sources
pollution control endeavours
pollution control
pollution control specific forms
air : prevention
pollution control
pollution : statutory form
pollution extent
air navigation
pollution effects
pollution prevention
pollution standard
pollution control : objectives
pollution prevention
pollution danger
pollution act
south - african industrial
oil pollution act : cases
Specific air quality standards
marine pollution control


Figures 2.1 and 2.2.
SIMPR search for information on air pollution


Back to the bran tub

A fast retrieval system that gives a useful indication of 
the content and context of each hit results in a different 
approach to information search and retrieval. Instead 
of constructing a complex search query, and developing 
a strategy for search refinement, the searcher can just 
dip into the information base, using a word or two that 
seems relevant to his interests. By examining the 
results, downloading or printing anything that looks 
interesting, more useful search terms can be identified, 
and used to find more documents. 

Artificial search mechanisms, such as truncation,
wild cards, and logical query formulation, become
irrelevant. It is quicker to type in a few key characters
of a term and let the system find it than laboriously (for
those of us too lazy to touch type) to key in a lengthy
noun or phrase. Finding information is best achieved
by dipping in, looking at the results, saving what is
worth keeping, and dipping in again using a different term.

The PC-based retrieval software that has been 
developed using this technique currently uses simple 
string matching techniques. More complex techniques
can be used if required. Morphological matching would
be useful, especially since the SIMPR indexes have 
already been consolidated on morphological form. 
This could be achieved by simple stemming at retrieval, 
or by use of morphological look-up tables, to find the 
base form of any query word. 
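As a rough sketch of morphological matching at retrieval time, the code below reduces each query word to a base form, first by a look-up table and then by crude suffix stripping, before comparing it with the index phrases. The table entries, the suffix list and the function names are all hypothetical, and the suffix stripping is far simpler than a real stemmer or the morphological analysis used to build the SIMPR indexes.

```python
LOOKUP = {"mice": "mouse", "feet": "foot"}  # hypothetical irregular forms
SUFFIXES = ("ing", "ed", "es", "s")  # illustrative only, not a full stemmer


def base_form(word):
    """Return a base form via the look-up table, else strip one suffix."""
    word = word.lower()
    if word in LOOKUP:
        return LOOKUP[word]
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def match_terms(query, index_terms):
    """Return index phrases that share a base form with any query word."""
    query_forms = {base_form(w) for w in query.split()}
    hits = []
    for phrase in index_terms:
        phrase_forms = {base_form(w) for w in phrase.split()}
        if query_forms & phrase_forms:
            hits.append(phrase)
    return hits


index = ["reducing pollution", "preventing oil pollution", "noise pollution"]
print(match_terms("reduced", index))
print(match_terms("prevented", index))
```

A query for ‘reduced’ finds ‘reducing pollution’ because both words reduce to the same form, which plain string matching on the whole word would miss.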

On a top-end PC simple morphological analysis 
could be performed automatically. In languages other 
than English, techniques that depart completely from 
word matching have been found effective. The Dutch 
and German languages, with their changes of spelling 
with inflexion and compounding, respond well to a 
technique developed by the SIMPR Partners TNO in 
Holland, which matches using three-letter combinations 
(trigrams) from the query terms that are found to be 
statistically significant. This technique works less well 
for English, where it just introduces noise into the 
retrieved set. But use of both techniques in the same 
system gives a foundation for a multilingual interface 
to search a multilingual database. 
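The idea of matching on three-letter combinations can be illustrated with a simple trigram overlap measure. This sketch is an assumption-laden simplification: it uses every trigram and the Dice coefficient, whereas the TNO technique described above selects trigrams that are statistically significant, a step not modelled here. The Dutch words are chosen only as examples of compounding.

```python
def trigrams(word):
    """Set of three-letter substrings of a word, padded with spaces."""
    padded = f" {word.lower()} "
    return {padded[i : i + 3] for i in range(len(padded) - 2)}


def dice(a, b):
    """Dice coefficient between the trigram sets of two words."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return 2 * len(ta & tb) / (len(ta) + len(tb))


# 'luchtvervuiling' (air pollution) contains 'vervuiling' (pollution),
# so the two share most trigrams although the whole words differ.
print(round(dice("luchtvervuiling", "vervuiling"), 2))
print(round(dice("luchtvervuiling", "verkeer"), 2))
```

Because a compound keeps most of the trigrams of its parts, the measure links the query word to the compound without any knowledge of Dutch word formation.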

The SIMPR retrieval approach could be combined 
with vector space techniques, to use the entire retrieved 
documents, or the relevant parts of them, as input to the 
system. These can be analysed and indexed, and the 
information base searched for other documents with 
correspondence to the set of index terms extracted 
from the retrieved set. Systems such as CLARIT from 
Carnegie-Mellon include measurements of statistics 
of word occurrence and co-occurrence, and can 
implement this form of search feedback; as yet SIMPR 
cannot. 

If the bran tub model is used to retrieve information,
there is no longer any need to store it in a structured or
linear way. Document sections can be authored, indexed,
and stored as they are written, without concern over
how they relate to each other, until the subject matter
of the application has been fully documented. The
software can help to determine whether a topic has been
documented, simply by conducting a search for it. If
the search does not produce the required information,
and the information is authored or found elsewhere,
then it can be indexed and added to the system.

Users do not then ‘read’ an information base as 
they would a book, they dip into it when they need to, 
to find the answer to a question. A successful search is 
one that finds the required answer, or that finds all 
relevant information. 


Applications for SIMPR: ‘Using it’ 
As an indexing and retrieval system, SIMPR is 
implemented as an indexing server, on a workstation, 
supplying processed information to a distributed 
information base stored on, and accessed via, PCs. 
This configuration can be used in a large 
organization or library, to index information and 
manage access to it. Or SIMPR can be used to provide 
an indexing service, with the results delivered in three 
basic forms: 


1. An Electronic Book, that combines: a set of files 
containing textual information (and associated 
drawings, photos, etc.); an index to the texts, containing 
phrases extracted from them automatically; a software 
retrieval system to enable the index to be searched and 
relevant information retrieved and displayed. 

The Electronic Book invites a user to type in a few
words that identify his/her interests. The software then
displays the index terms that match the words the user
has typed in. When the user selects one of the displayed
index terms, the heading, and then the text, of every
document (or every page) that was indexed using that
term is displayed to the user.

The applications for the Electronic Book could 
include: 

• an electronic textbook,

• a hyperdocument,

• a technical manual,

• a collection of reports or short documents or
abstracts,

• a full text database.


2. An Electronic Catalogue; this operates in a similar 
way to the Electronic Book, but does not contain the 
indexed information, just the pointers to where that 
information can be found. A user enters a search query, 
and then browses a set of index terms that matched the 
query. When the user selects a relevant index term, the 
Electronic Catalogue supplies the name and details of 
the document that contains the required information. 
The Electronic Catalogue provides an electronic
index to a set of journals or books, or an index to a suite
of technical journals, or a catalogue for a collection of
reports or short documents. It can index a much larger
volume of information than can be contained in the
Electronic Book, and could be used to provide an
index to a very large full-text database.


3. The Electronic Marketing Assistant, which puts a
specialized, rule-based retrieval interface on top of an
index to marketing information: brochures, product
descriptions, technical literature, etc., for a company's
product range. It is aimed at sales and technical people
within companies who need to carry large amounts of
technical information around with them, or access it
remotely. It gives them a chance to replace a briefcase
full of ring binders containing product descriptions,
technical reports, specifications, price lists, etc., with a
notebook computer containing all the required
information and a specialized index to access it.

These three types of system exploit the use of 
natural language software to store and search for 
information. In addition to this basic application, 
SIMPR can be developed to implement other more 
advanced systems for information management. These 
include: 


• Natural language interfaces to software systems.

• Classification of documents using a thesaurus or
classification system.

• Extraction of phrases and sentences from a
document, for use in preparation of a document
abstract, or in machine translation based on
phrase substitution.

• Management of large and complex sets of
documents, such as suites of technical documents,
or project documentation, including reuse of
document components, version control,
updating, and so on.

• Electronic publishing systems, to deliver
information in a searchable, interactive electronic
document, or to manage images scanned and
recognized using a DTP/OCR system.

• Checking documents for compliance with
standards of presentation, style, or electronic
representation, and conversion of information
between different sets of standards.

• Information profiling and modelling, for
cooperative working and teleworking support,
interactive training systems (Courseware),
knowledge base construction for expert systems,
and so on.


Conclusion 
SIMPR is one of the first commercially-available 
information management systems based on techniques 
of natural language analysis. It shows that by developing 
software that can manipulate information using the 
same language structures that humans use to manipulate 
information, we can build computer systems that are 
easier to use, require less expertise, and can make 
information more readily available. 

More significantly, natural language systems reveal 
how we model our methods for handling information
on the perceived limitations of paper-based systems.
Electronic systems offer great flexibility, and opportunities
to build and manage more complex models of
information. Software can handle the complexity of 
information searches through multi-faceted information 
networks, provided it can be based on some technique 
to discriminate information based on its content. We 
perform that discrimination by scanning the words 
used to record the information, and natural language 
systems can follow the same approach. 

Intelligent systems based on natural language 
analysis offer a flexibility to support different human 
work patterns, and eventually can make redundant 
many of the artificial techniques we currently use to
handle information.


1. Introduction

Corpus Linguistics is the study of large, computer- 
held bodies of text, or ‘corpora’. In the last five years, 
this approach to language study has become 
increasingly popular among linguists, and develop- 
ments in computing technology and software and in 
storage mechanisms like CD are making it possible 
even for the individual PC user. The aim of the linguist 
is to describe the language, and corpus linguistics 
reflects the shift in academic focus from the brain to 
the text as the appropriate source of information. A 
description derived by introspection will tend to be 
idiosyncratic and partial, since no individual has total 
awareness of how they or others use language. A 
description based on the observation of appropriate 
corpus data, on the other hand, can provide a broader 
view of language use, including statements about the 
relative typicality of individual features based on their 
frequency of occurrence in the corpus. 

On the corpus-linguistic continuum, the study of 
raw ASCII text is situated at one end, and the study of 
heavily pre-coded text at the other. The work of the 
Unit sits very much at the ‘raw’ end, a philosophical 
position which is reflected in the title of this paper. 

Some corpus linguists are concerned with language 
description for its own sake; we study the patterns of 
language in order to see how the computer can be 
made to discover and exploit these automatically, in 
order to facilitate even larger-scale study, and to solve 
problems in associated areas, such as text retrieval. 

In our approach, since the computer does not move 
away from the text, we must discover ways in which it 
can be made to work with what is available in the text 
itself. Accordingly, the basic units of information are 
words, singly and in combination, word frequencies, 
and the positions of words in relation to each other. 
‘Sticking to the text’ in this way leads us to develop 
systems that do things differently from the way a 
human would, and I would like to explain how this 
approach works with reference to some examples of 
recent work in the Unit. 


2. Recent Research 
2.1 Using Word Frequency to Identify
Changes in the Lexicon

There is a need in several fields, among them language 
teaching, information technology and lexicography, 
for an automatic means of discovering facts about the 
vocabulary of the language and about how it is 
changing. In a large DTI-SERC funded project, entitled
‘AVIATOR’, one of our aims has been to develop
systems that monitor such information, and we have
now completed the work.

To establish changes in the language, it is necessary
to treat text as a chronological flow, rather than as a 
static entity. Our data flow has chiefly been The Times 
newspaper, although other types of text, such as BBC 
World Service data, are also handled. The system 
consists of four filters, each of which records different 
facts. 

Filter 1 is set up to identify new words in the language.
Since the computer cannot know a priori whether
a word is new or not, new words are defined as being 
those that the computer has not encountered before. In 
fact, a human would ultimately have to adopt the same 
criterion. The system compares the contents of the text 
flowing across it with the words it has already recorded, 
and marks and dates first and latest occurrences. A 
sample of output from the ‘ordinary words’ category 
will show what is found by this method: 


NEW WORDS: THE TIMES FEBRUARY 1991 


Feb-06 rector of LWT, says his boardroom is now 
debugged before every meeting. ‘We had a 
bit of 


Feb-14 is attractively handled, and enriched by the 
gloopy dollops of pastiche knight-speke that 


Feb-17 down at any rate.’ The result is an 
allobiography (about other people rather 
than oneself) 


Feb-17 shis youngest son, Richard Brooks, a buppie 
(black urban professional) malcontent. The 
sonh 


Feb-23 in his tracks, force him to unplug the 
acoustiguide and stand and stare? 


Feb-24 It is readily available from tax returns. 
Footering with past failure to face an election 


Mar-02 rationalise, internationalise, and almost to 
common-marketise our birds. It is a task 


Mar-02 Poets are not very useful, because they are not
consumeful or very produceful 


Mar-08 national Symbolism to feature the sort of 
boneless-wonder figure drawing that occurred 


Mar-08 tation, Manchester's 808 State represents the 
dance-trance arm of latterday psychedelia.


Mar-09 pping over others. It is in these unobtrusive
arpeggiations that the fire reveals itself.


Mar-23 s, a few crude paintings. A teeny bit dog’s 
dinnerish. The service is occasionally 
patronising 


Mar-24 light of chandeliers. ‘It’s our attempt to 
de-yuppify the place,’ said Derek Statt 


Mar-24 clash last weekend between England and 
France. Empurple the prose any way you 
like, and 


Mar-31 ome brethren of the Holy Trinity were keen 
choirboy-spotters, Hopkins’s diaries 


Mar-31 publisher, Dr Francis, and Thatcher not the 
ex-leaderene but a certain David. However 


What emerges is a range of lexical phenomena. 
There are bona fide new words, built by the standard
rules of word formation, such as the blend
‘acoustiguide’. There are some productive items, built
by analogy on already extant forms, such as ‘buppie’
and ‘de-yuppify’ (after the acronym ‘yuppie’) or
‘choirboy-spotters’ (after the compound ‘train-spotters’).
There are some new inflections, such as
‘common-marketise’. However, some of the words
that are identified are not actually new, although 
meeting the automatic criterion of not having appeared 
in previous text. One example is ‘arpeggiations’. 
This word has not appeared before because it is a fairly 
specialized technical term, that pops up rarely; in other 
cases, an ‘old’ word may suddenly re-enter the 
language, as with ‘poll-tax’, or ‘ecu’. Some words in 
the above sample would strike the human as being 
ephemeral or nonce formations, such as ‘consumeful’ 
or ‘(dog’s) dinnerish’; our system can also identify 
them as such, by noting when and to what degree they 
appear, flourish and disappear again: this diachronic 
monitoring is the function of Filter 4. 
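The criterion used by Filter 1, that a word is new if it has not been encountered before, can be sketched in a few lines. The data structure, function name, tokenising pattern and the first of the two dates are assumptions made for illustration; this is not the AVIATOR software itself.

```python
import re
from datetime import date


def update_lexicon(lexicon, text, seen_on):
    """Record first and latest occurrence dates; return the new words."""
    new_words = []
    # Words may contain internal hyphens or apostrophes (e.g. de-yuppify).
    for word in re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower()):
        if word not in lexicon:
            lexicon[word] = {"first": seen_on, "latest": seen_on, "count": 0}
            new_words.append(word)
        entry = lexicon[word]
        entry["latest"] = seen_on
        entry["count"] += 1
    return new_words


lexicon = {}
# Earlier text in the flow (hypothetical wording and date).
update_lexicon(lexicon, "our attempt to improve the place", date(1991, 3, 1))
# Later text, based on the Mar-24 sample line.
new = update_lexicon(lexicon, "our attempt to de-yuppify the place", date(1991, 3, 24))
print(new)
```

Keeping the count and the first and latest dates for every word is what would allow a later stage to see whether a new word flourishes or disappears again.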


CHARTER - Profile of Collocational Changes

*num* of the              @social      charter
the prime minister’s      >citizens’   charter
the government's planned  >citizens’   charter
the proposed european     @social      charter
the proposed european     @social      charter
government on the         @social      charter
commitment to the         @social      charter
the the european          @social      charter
that the european         @social      charter
nor on the                @social      charter
elements of the           @social      charter
is what the               @social      charter
that as the               @social      charter

2.2 Using Word Collocation to Identify Changes 
in Word Use 

Filters 2 and 3 can automatically identify changes in 
word use. The computer does not know about senses 
and meanings per se, of course, but we have developed 
a system which records the collocational environment 
of a word and compares this with the environment of 
subsequent occurrences of the word. In order to ‘stick
to the text’, we use the criterion of collocational change
to indicate a change in use. The theoretical assumption
underlying this approach is that collocation is a type of
meaning. This may all sound rather esoteric: an example
of our output may make things clearer. The following 
lines are instances logged by the computer where the 
word ‘charter’ has changed its profile from collocating 
with ‘company’ or ‘plane’ and meaning ‘specially- 
hired’ or ‘cheap’, to a new profile with a more 
sociological flavour (see below). 

In the profile below, ‘citizens’ is shown as being a
newly-occurring collocate, while ‘social’ is
shown as an established collocate which has suddenly
increased significantly its frequency of occurrence
[...] found as far back as the Magna Carta.

The computer cannot make the final decision about 
whether the new uses of the word ‘charter’ also signify 
new senses. Ultimately that is a matter of human 
judgement. However, an automated system can sift 
through vast amounts of data and create a feasible 
post-editing task for the human. 
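A minimal sketch of comparing collocational environments is given below. It assumes that ‘>’ in the profile marks a newly-occurring collocate and ‘@’ an established collocate whose relative frequency has risen sharply; the window size, the threshold factor, the function names and the two token sequences are hypothetical.

```python
from collections import Counter


def collocates(tokens, node, span=4):
    """Count words occurring within `span` positions of each node word."""
    profile = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            window = tokens[max(0, i - span) : i] + tokens[i + 1 : i + 1 + span]
            profile.update(window)
    return profile


def profile_changes(old, new, factor=3.0):
    """Flag collocates that are new ('>') or sharply more frequent ('@')."""
    old_total = sum(old.values()) or 1
    new_total = sum(new.values()) or 1
    flags = {}
    for word, count in new.items():
        if word not in old:
            flags[word] = ">"
        elif count / new_total >= factor * old[word] / old_total:
            flags[word] = "@"
    return flags


earlier = "cheap charter plane and charter company and social charter and charter plane".split()
later = "the social charter and the citizens charter and the social charter".split()
old_profile = collocates(earlier, "charter", span=1)
new_profile = collocates(later, "charter", span=1)
print(profile_changes(old_profile, new_profile))
```

The output only flags candidates; as the text says, deciding whether a flagged collocate signals a new sense remains a matter of human judgement.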


2.3 Using Word Repetition and Word Positioning 
in Automatic Abstracting 

The human being is capable of reading a text and 
summarizing its message in the form of a new, shorter 
text. The computer cannot yet do this, and any abstract 
that it creates at the moment has to be made up of 
words and sentences drawn exclusively from 
the original text. This type of abstract is better termed 
an ‘abridgement’. 

Various methods of abridgement are being
experimented with by academics and companies such
as software houses but results are not likely to be
impressive because of the lack of knowledge about the
relationship between the words and the ideas in a text.
Sentences are extracted on the basis of frequency of
occurrence of keywords, or because they contain pre-selected
phrases thought to mark important stages in
the structure of a text, such as ‘the process’. Whilst
there might well be some purpose in exploiting the
metalinguistic aspect of text, these approaches are still
too linguistically naive. The result is likely to be an
odd selection of sentences, not all of which summarize
a section of the text, and which are not easy to read
because they are not related to each other.

[Right-hand contexts of the CHARTER profile, displaced here in extraction:
there would be a / to be unveiled next / tx although shares are /
tx Sir trevor holdsworth / tx this is not / tx the labour party /
when mr norman fowler / which seeks to ensure / which both went through /
which risk adding to / would do to this / would require the abolition]

Aslib Proceedings, vol.45, no.5

Keeping to the principle of ‘sticking to the text’, 
however, my Unit has developed systems of automatic 
abridgement that in most cases produce acceptable 
abridgements. Based on ideas of Dr. Michael Hoey, of 
Birmingham University, our systems variously trace 
the patterns of lexical repetition in a text and use this 
information to select key sentences. Sentences found 
to be most heavily cohesive are deemed to be core 
information bearers. The abridgements are not only 
adequate accounts; they preserve a lexical inter- 
relationship between the key sentences which allows 
these to be read together as a text. 
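The general idea of selecting sentences by lexical repetition can be sketched as follows. This is an illustrative simplification and not the Unit's software: repetition is reduced to exact word identity, and the stopword list, the threshold of three shared words needed to link two sentences, and the function names are assumptions.

```python
import re
from itertools import combinations

STOPWORDS = {
    "the", "a", "an", "of", "in", "to", "and", "that",
    "is", "was", "for", "by", "with", "on",
}


def content_words(sentence):
    """Lower-cased words of a sentence, minus a small stopword list."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w for w in words if w not in STOPWORDS}


def abridge(sentences, min_links=3, keep=6):
    """Keep the sentences linked to the most others, in text order."""
    words = [content_words(s) for s in sentences]
    bonds = [0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        # Two sentences are linked when they repeat enough words.
        if len(words[i] & words[j]) >= min_links:
            bonds[i] += 1
            bonds[j] += 1
    ranked = sorted(range(len(sentences)), key=lambda i: (-bonds[i], i))
    chosen = sorted(i for i in ranked[:keep] if bonds[i] > 0)
    return [sentences[i] for i in chosen]


text = [
    "The Chancellor ruled out early cuts in interest rates.",
    "City investors had hoped for cuts in interest rates early in the new year.",
    "The weather in London was mild.",
    "The Chancellor said interest rates cuts depend on growth.",
]
for sentence in abridge(text):
    print(sentence)
```

Because the selected sentences are chosen for the words they share, they tend to read together as a connected text, which is the property claimed for the abridgements above.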

The following is a six-sentence abridgement. The
original article, 22 sentences in length, is provided
for reference in the Appendix.


Article from The Times Newspaper, 
31st December 1992 


Bullish Lamont offers no early rate cuts 
By Peter Riddell and Anatole Kaletsky 
31 December 1992


BRITAIN’S economy will do much better next 
year than in 1992, but there will be no further 
reductions in interest rates unless growth falls 
below the Treasury’s expectations, according to 
the Chancellor of the Exchequer, Norman Lamont, 
in an exclusive new year interview with The Times. 


[8] Mr Lamont’s remarks may, however, 
disappoint the City, where many investors have 
been hoping for a further cut in interest rates early 
in the new year. 


[9] The Chancellor said that interest rate reductions 
would be considered only ‘if monetary demand 
was manifestly too low’. 


[10] Asked whether he would expect to change 
interest rates if the economy performed in line with 
the Treasury’s forecast of 1 per cent growth, the 
Chancellor replied with an emphatic ‘no’. 


[12] Mr Lamont said that Autumn Statement 
measures for industry and housing, and the big cuts 
in interest rates and the devaluation of sterling 
since Black Wednesday had ‘created the right 
conditions for confidence and growth’. 


[14] Mr Lamont estimated that as much as two-thirds
of the impact of the recent three-point
reduction in interest rates was ‘still in the pipeline’
and added that the ‘very warm welcome’ given by
industry to his Autumn Statement measures meant
that there was ‘every chance that they will succeed’.


This abridgement was generated using our default 
settings. A shorter, four-sentence, version of the 
abridgement contains the sentences 1, 9, 12 and 14. 
Although our system does not apply a weighting to any 
particular section of the text, it tends to select initial 
sentences in journalistic articles because they are 
lexically rich and so achieve the required threshold in 
terms of repetition. This accurately reflects journalistic 
practice, where the essence of the text is typically 
summarized in the opening sentence or sentences. 


2.4 Using Word Clusters in Automatic
Text Retrieval

An essential part of helping a database user to select a 
relevant text is discovering a way of conveying 
information as to what the text is about. The abstract is 
an explicit statement of ‘aboutness’; an index is a more 
implicit one. Indexing has long been automated and, 
generally speaking, automatic indexing will be based 
on calculations of frequency and/or relative frequency 
of word occurrence. 

Another aim of the Unit’s AVIATOR project has
been to investigate the relationship between the patterns
of words in a text and its conceptual concerns. The 
research builds on a pilot study done in the early 1980s 
by Dr Martin Phillips, a former postgraduate student at 
Birmingham. 

By recording and monitoring the contexts of every 
word in a text, we can automatically identify lexical 
‘clusters’ that echo the topic or topics of a text. We 
shall illustrate what is meant by a cluster by presenting 
a set of clusters extracted from a book entitled The 
Living Planet, for Chapter 1, Furnaces of the Earth. 
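Before turning to the example, the notion of a cluster can be made concrete with a generic co-occurrence sketch. It is not the procedure used in the project or in the pilot study: treating the sentence as the context, the minimum co-occurrence count, the function names and the toy sentences are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(sentences, stopwords=frozenset()):
    """Count how often two words appear in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        words = sorted({w for w in sentence.lower().split() if w not in stopwords})
        pairs.update(combinations(words, 2))
    return pairs


def clusters(pairs, min_count=2):
    """Group words linked, directly or indirectly, by frequent co-occurrence."""
    graph = defaultdict(set)
    for (a, b), count in pairs.items():
        if count >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Collect every word reachable from this starting word.
        group, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word not in group:
                group.add(word)
                stack.extend(graph[word] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


toy = ["lava ash chamber", "lava ash vent", "gas smoke steam", "gas smoke vent"]
print(clusters(cooccurrence(toy)))
```

Words that repeatedly share a context fall into the same group, while a word such as ‘vent’, which occurs once with each group, joins neither.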

The book, by David Attenborough, is written to 
follow the episodes of a television series. It has an 
underlying theme of how nature survives and lives on 
in various hostile environments. Each chapter has a 
distinct topic. Chapter 1 describes the different 
processes by which volcanoes are formed, the kinds 
of devastation they cause, and the way in which 
natural life returns in the aftermath. The set of clusters 
looks as follows: 


Set of Clusters for Chapter 1 of
‘The Living Planet’

lava ash chamber 

lava splashes vent thrown river 
basalt flows volcanoes erupting ridge 
gas smoke steam 

kilometres long small krakatau anak 
flow currents convection descending 
line junction concealed 

produced catastrophic explosion 
sumatra java emit 

krakatau reclaimed century 

mallee fowl incubation 
This is obviously an implicit statement of what the 
text is about, and at first sight, it looks rather strange. 
To highlight its particular features and to evaluate its 
success in expressing the concerns of the text, we 
present here a manual abstract and index of the same 
chapter. This work was commissioned by us from 
Information Unlimited, and the abstractor, who is not 
an indexer, nevertheless was kind enough to do both 
tasks for us because of time constraints. It is not usual 
for an abstractor to summarize the chapter as a textual 
unit; she therefore treated it as though it were an article 
in a journal. Her analysis is as follows: 


Manual Abstract 

This heavily illustrated chapter is a dramatic 
description of the Earth’s more spectacular volcanic 
phenomena, both on land and under water. The 
author explains the burning lava flows of Iceland, 
the colossal volcanic explosions of Krakatoa 
(Krakatau) and Mount St. Helen’s, as well as the 
underwater hot vents and hot springs on land. 


The author suggests briefly that such phenomena 
were probably involved in the origins of life on 
earth; he also emphasizes how fast flora and fauna 
colonise the volcanic debris of the devastated 
landscape. 


Indexing Terms 

Major terms:     Volcanoes 
                 Lava 

Minor terms:     Eruptions, volcanic 
                 Geysers 
                 Krakatoa/Krakatau 
                 Mount St. Helen’s 

Compound terms:  Volcanoes - Origins and causes 
                 Volcanoes - Flora and Fauna 
                 Iceland - volcanoes 


Reader’s remarks: The passage is self-contained, 
easy to read, with many fascinating facts about 
volcanic activity. The illustrations add significantly 
to its interest. Most intelligent readers would be 
encouraged to find out more on the subject, and to 
continue reading the book. No external information 
is necessary to understand the chapter. 


Whilst there is a degree of match between the 
words in the abstracts produced automatically and 
manually, the differences that exist are fairly clear. An 
obvious one is that the automated set of clusters consists 
exclusively of words in the text, whereas the manual 
abstractor does not feel constrained to express herself 
solely in the words of the original text. For instance, 
she uses the terms ‘flora’ and ‘fauna’ in summarizing 
the Living Planet chapter, although only the term 
‘flora’ appears, once, in the text. 

The abstractor’s approach is to have a particular 
readership in mind for each abstract. She assumed in 
the case of our project books that these readers might 
read her abstract in a public library whilst in search of 
a book on the particular topic. She sees it as her job to 
interest the potential reader in the book. For this 
reason, she includes evaluative comment in her abstract, 
about the intrinsic level of interest generated by the 
topic, about the way the author expresses him/herself, 
and even about the layout of the text. For instance, she 
comments on the degree of illustration in The Living 
Planet. The automated system may also capture some 
metalinguistic comment, but since its criterion for 
entering words into a cluster is statistical, the 
metalinguistic words will have had to have occurred 
significantly, and be rooted in the text itself, to be 
picked up. They will therefore refer more to the 
organization of the text, words such as ‘tables’, or 
‘graphs’, or ‘chapter’, rather than to its interest value. 

Identification of topic is a problem both for the 
machine and the human. A pilot study done with MA 
students at Birmingham revealed that each one saw the 
main concerns of the Living Planet chapter somewhat 
differently. Or, to be more precise, the wording they 
used to express what they saw as its conceptual concerns 
differed. The topic of topic complexity is an important 
one, but cannot be gone into in detail here. Suffice it to 
say that the Living Planet chapter is clearly about 
several things, but two main aspects or stages of the 
topic flow are volcanic eruption and the subsequent 
regeneration of natural life. The manual abstract reflects 
this, and the automatic clusters do too, but do not give 
it as much prominence, because it is expressed in more 
varied vocabulary. However, it has been picked up in 
the clusters: 


krakatau reclaimed century 
mallee fowl incubation 


Another difference between the two abstracts is a 
result of the human abstractor’s ability to ignore or 
weight certain sections of the original text. On the one 
hand, the human tends to ignore examples given in the 
text, concentrating instead on the main arguments; on 
the other, she uses any metatextual or typographical 
information, such as sub-headings, as a guide to focus 
and weighting of the abstract. It would be possible for 
us to modify the automatic system to take some of this 
into account. 

Comparing the index with the automatic cluster, it 
is clear that there are also differences here. In the 
selection of key terms, the human abstractor has selected 
those that she thought would occur in the minds of 
potential readers, even if they did not necessarily occur 
in the original text. 

It is also customary for a professional indexer to 
index on nouns, and furthermore on plural nouns, such 
as ‘Geysers’ or ‘Volcanoes’. Indexing on the singular 
form of a noun would only be done where it was 
abstract or uncountable, i.e. where it didn’t really have 
a plural form. Whilst clusters can be created where all 
inflections of a word are counted together, or 
‘lemmatized’, and the clusters made to present just the 
base (or other preferred) form, the form chosen for the 
cluster must have occurred in the original text. 


Aslib Proceedings, vol.45, no.5 





The automatic clusters are fairly satisfactory in 
that they reflect the changing aboutness of the chapter, 
and have done this purely by ‘sticking to the text’. 
They are less easily readable than the manual abstract, 
but they are potentially more informative than an 
index made up of individual words. 


3. Conclusion 

Nobody can know for sure what the future will hold for 
textual information, but some trends are clear. Some 
texts will continue to exist in at least two mediums, 
particularly where people wish to read them for 
pleasure, or need to carry out close study of them, or 
observe some feature of their original layout. Many of 
these will also exist electronically, where people need 
to receive, extract, retrieve and transmit information 
quickly. In addition, reflecting the two types of reader 
need, some texts that have hitherto been available only 
in hardcopy, such as old manuscripts and facsimiles, 
will be increasingly converted to electronic form; 
while, on the other hand, texts that are only needed for 
their information content, such as operating manuals, 
will only have an electronic existence. 

Abstractors and indexers may find themselves 
taking the original wording of the text more into 
account as the focus moves towards the electronic 
medium and away from hardcopy. There will be 
software of the kind described in this paper available 
to influence them in this. 

Automatically-generated products, whilst being 
fast and excellent for some purposes, are not yet all 
readable or reader-friendly. This is partly because the 
computer can only represent the writer’s model of text, 
whereas the human agent, as abstractor or indexer, 
adopts the reader’s perspective. I think that the kind of 
software described in this paper, in addition to being 
used to present finished products to the user, will serve 
a very useful function as an intermediary in the 
information chain. For example, the automatic 
abridgements could be used to find other relevant texts 
in databases; the clusters as a feedback stage in a 
keyword search system, or as a first-stage indexer. 
Abstracts of the abridged variety may be first-drafted 
by computer, then smoothed into a more interpretable 
shape manually. This would suggest a change in the 
role of the human agent, from text creator to text 
editor. 
Original Text of Automatic Abridgement 

Bullish Lamont offers no early rate cuts 

By Peter Riddell and Anatole Kaletsky 

31 December 1992 

BRITAIN'S economy will do much better next year 
than in 1992, but there will be no further reductions in 
interest rates unless growth falls below the Treasury's 
expectations, according to the Chancellor of the 
Exchequer, Norman Lamont, in an exclusive new year 
interview with The Times. He is bullish about the 
economic prospects and unrepentant about the 
government's performance in the past year. 

Mr Lamont said: ‘Recent evidence has been 
encouraging. We have had very good car sales in 
December and reports of buoyant sales in the shops. 
Surveys of business confidence have improved. There 
is every reason to believe that 1993 will be much better 
than 1992. I would not be surprised if trends in the 
British economy were better than in some of our 
European competitors.’ Mr Lamont’s remarks may, 
however, disappoint the City, where many investors 
have been hoping for a further cut in interest rates early 
in the new year. The Chancellor said that interest rate 
reductions would be considered only ‘if monetary 
demand was manifestly too low’. 

Asked whether he would expect to change interest 
rates if the economy performed in line with the 
Treasury’s forecast of 1 percent growth, the Chancellor 
replied with an emphatic ‘no’. 

He repeatedly expressed confidence that he had 
done enough in his Autumn Statement to ensure that 
his forecasts of economic recovery would be fulfilled. 
Mr Lamont said that Autumn Statement measures for 
industry and housing, and the big cuts in interest rates 
and the devaluation of sterling since Black Wednesday 
had ‘created the right conditions for confidence and 
growth’. Monetary policy had already been relaxed 
‘very substantially’ through the interest-rate cut and 
sterling’s devaluation. 

Mr Lamont estimated that as much as two-thirds 
of the impact of the recent three-point reduction in 
interest rates was ‘still in the pipeline’ and added that 
the ‘very warm welcome’ given by industry to his 
Autumn Statement measures meant that there was 
‘every chance that they will succeed’. The 
combination of monetary relaxation and carefully 
directed fiscal measures had created a climate of 
confidence. 

Mr Lamont was unrepentant about sterling’s 
withdrawal from the European exchange-rate 
mechanism. The ERM had brought ‘enormous benefits’ 
to Europe and had helped Britain to defeat inflation 
during its membership, he said. However, if other 
countries now choose to tie their currencies even more 
closely in narrower ERM margins, the implications for 
Britain would be limited, he said. 

Mr Lamont was unperturbed by the size of Britain’s 
current account deficit, despite concern in the business 
community that the balance of payments gap will be a 
constraint on economic growth. ‘I’m obviously not 
indifferent to the current account, but I do not regard it 
as my major problem,’ he said. 

The one economic problem that did seem to worry 
Mr Lamont was the high level of the public sector 
borrowing requirement. 


CD-ROM bibliographic database searching at 
the KFUPM library: a use analysis 


Mohammad I. Mirza and Moid A. Siddiqui 


Abstract 


The article analyses 2378 CD-ROM bibliographic database searches (by databases used, user status, departments, 
and user needs) conducted at the King Fahd University of Petroleum & Minerals (KFUPM) Library in Saudi Arabia 
during the period July 1991 (when service started) to 31 December 1992. Various purposes for compiling CD-ROM 
statistics are also discussed. 


1. Introduction 

CD-ROM became available commercially for 
searching in 1985 and is now well placed in academic 
libraries as a reference tool. Now, almost all libraries 
are using this new technology to provide an alternative 
to online, microform, and print to satisfy their users’ 
search needs. The number of CD-ROM bibliographic 
databases continues to increase rapidly because of 
their acceptance by academic libraries. 

The KFUPM Library has subscribed to eight 
CD-ROM bibliographic databases to provide 
reference services to its patrons with greater speed, 
accuracy and efficiency. This article analyses these 
public use databases available in the Reference 
& Information Services Division (RISD) of the 
KFUPM Library. 


Table 1. 
Bibliographic and non-bibliographic CD-ROM databases 

A. Bibliographic Databases 

DATABASE                                            COVERAGE 
 1. ABI/INFORM ONDISC (ABI)                         Jan 1971 - Present 
 2. APPLIED SCIENCE & TECHNOLOGY INDEX (AST)        Oct 1983 - Present 
 3. COMPENDEX PLUS (COM)                            Jan 1987 - Present 
 4. COMPUTER ARCHIVE (COMA)                         Jan 1988 - Present 
 5. DISSERTATION ABSTRACTS ONDISC (DAO)             1861 - Present 
 6. MATHSCI DISC (MATH)                             Jan 1988 - Present 
 7. NATIONAL TECHNICAL INFORMATION SERVICE (NTIS)   Jan 1983 - Present 
 8. SCIENCE CITATION INDEX (SCI)                    Jan 1987 - Present 

B. Non-Bibliographic Databases 

DATABASE                                            COVERAGE 
 9. CD MARC NAMES                                   Cumulative through June 1992 
10. CD MARC SUBJECTS                                Cumulative through June 1992 
11. CD MARC BIBLIOGRAPHIC                           Cumulative through April 1992 
12. LC MARC BIBLIOFILE (English)                    Jan 1988 - Oct 1991 
13. BOOKS IN PRINT PLUS                             Current Update 
14. ULRICH’S PLUS                                   Current Update 





2. Background 

The King Fahd University of Petroleum & Minerals is 
one of the seven universities of Saudi Arabia. The 
University offers 23 bachelor’s, 17 master’s, and 7 
Ph.D. programs in the fields of management, 
environmental design, science, and engineering. 

The enrolment in Fall 1992 exceeded 5,730 
students, with 659 faculty and 1,370 technical and 
non-technical staff. In addition, non-KFUPM borrowers 
total 1,101. The faculty is multinational and the medium 
of instruction is English. 

The KFUPM Library has a collection of more than 
290,000 volumes of books and 548,000 audio visual 
and non-print items. The Arabic collection totals about 
25,000 volumes on Islam, the Arabic language and 
general studies. Of the collection, 75 per cent is in 
science and engineering, while 25 per cent is in the 
areas of humanities and social sciences. 

The library serves the university community with 
21 professional librarians and 20 support staff. Books 
and periodicals are housed on open stacks, providing 
free and easy access. International standards for 
processing library materials, such as AACR2 and the 
Library of Congress Subject Headings are used. 


3. Computerization 

Established on the pattern of academic libraries in the 
US, the KFUPM Library played a pioneering role in 
computerizing its operations and services, not only in 
Saudi Arabia but also in the Arab world. 

Planning for computerizing the KFUPM Library 
operations and services started in the late 1970s. 
Computerization of all major services was completed 
during the 1980s: English and Arabic Online Public 
Access Catalogs (OPACs); literature searching of nine 
national King Abdulaziz City for Science & Technology 
(KACST) bibliographic databases through GULFNET 
(Gulf Network); searching of approximately 450 
international (DIALOG and ORBIT) bibliographic 
databases; and processing and controlling of 
interlibrary loans through an IBM PC AT. 


4. Purpose of the study 
The purpose of this study is to measure the effectiveness 
of the CD-ROM search service and to provide necessary 
guidelines by which library services can be improved. 
Services that may benefit from a comprehensive 
analysis of CD-ROM search statistics are as follows. 


a) Increasing CD-ROM use 

CD-ROM statistics could be analyzed to establish 
departments of the University which are either 
under-utilizing or heavily using the CD-ROM search service. 
Instruction about the CD-ROM service could be given 
to under-utilizing departments so that they become 
aware of the availability and benefits of this service 
and begin to use it. Various research activities and 
programs of departments using the CD-ROM service 
heavily could be described to encourage other 
departments to use the CD-ROM search facilities 
effectively. 


b) Collection development 

CD-ROM search statistics could be used to assess the 
weak and strong areas of the library's collection so 
that remedial measures may be taken to build up a 
strong and balanced collection. For instance, if a 
journal title has been requested on interlibrary loan 
more than 7-8 times by more than one patron as a 
result of CD-ROM searching, then it would be more 
cost effective to purchase the journal. 
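
The purchase rule described above can be sketched as a small filter over interlibrary loan requests. The data layout (pairs of journal title and patron identifier) and the function name are hypothetical, and the threshold of eight requests is only one reading of 'more than 7-8 times'; a library would set its own value.

```python
from collections import defaultdict


def purchase_candidates(requests, min_requests=8, min_patrons=2):
    """Return journal titles requested often enough, by enough patrons.

    `requests` is an iterable of (journal_title, patron_id) pairs, one per
    interlibrary loan request arising from a CD-ROM search. Both thresholds
    are assumptions for illustration.
    """
    request_count = defaultdict(int)
    patrons = defaultdict(set)
    for title, patron_id in requests:
        request_count[title] += 1
        patrons[title].add(patron_id)
    return sorted(
        title
        for title in request_count
        if request_count[title] >= min_requests
        and len(patrons[title]) >= min_patrons
    )
```

Requiring more than one patron keeps a single researcher's heavy use of one title from triggering a subscription on its own.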


c) Online searching 

The more effective use of CD-ROM search services 
will automatically decrease the number of online 
searches needed, resulting in cost savings. 


d) Evaluation of CD-ROM databases 

CD-ROM statistics could be compiled to find out 
which databases are heavily used and which are not. 
To provide a cost-effective CD-ROM search service, 
it is necessary that only heavily used CD-ROM 
databases be retained. 

e) Planning 

CD-ROM search statistics could also be used for long- 
range planning, such as budget development, human 
resources planning, etc. 


5. Literature review 

The literature contains numerous articles on every 
aspect of CD-ROM searching. Several articles describe 
different CD-ROM user surveys carried out in the past 
to measure the effective use of CD-ROM bibliographic 
databases in academic libraries. All surveys were 
conducted through questionnaires to library users. In 
general, the articles revealed a high degree of user 
satisfaction, that some training or assistance is needed, 
and that patrons prefer to search with CD-ROM rather 
than print indexes. 

A survey was mailed to 150 academic libraries in 
September 1987 to enquire about CD-ROM 
acceptance, use, cost, and potential. Respondents 
indicated that CD-ROM systems were accepted by 
library patrons. Allen conducted a survey in 1989 at 
the Undergraduate Library of the University of Illinois 
at Urbana-Champaign and concluded that libraries 
should consider providing some training or assistance 
to patrons using bibliographic databases on CD-ROM. 
Lowry conducted a study at the Columbia University 
Libraries. The study concluded that CD-ROMs are not 
luxuries, but are essential new resources for reference 
and research in academic libraries. A survey on the 
use of CD-ROMs in Europe was carried out during the 
first nine months of 1989. The survey showed a lack of 
knowledge and awareness about the availability, 
applications, and use of CD-ROMs. Another survey, 
conducted at Oakland University, concluded that 
students were overwhelmingly pleased with CD-ROM, 
found it easy to use, and preferred to search CD-ROM 
over print. 

The findings of a survey conducted at Hofstra 
University suggest that there is a high degree of user 
satisfaction; however, a training program is needed. A 
survey of librarians in the United Kingdom and the 
Republic of Ireland revealed that a majority of 
university libraries have adopted CD-ROMs, mainly 
in reference and acquisition, and that CD-ROM is both 
useful and cost-effective. At the Pennsylvania State 
University, an evaluation of CD-ROM was conducted 
in September 1989 and April 1990 to determine how 
users were reacting to this new electronic reference 
service. The results provided valuable information 
about who was using the systems, how patrons 
perceived help available to them, and what type of 
instruction they would desire. The University of North 
Carolina’s Davis Library conducted a month-long 
survey of users in Spring 1991. The survey revealed 
that the majority of patrons depend on CD-ROM as 
their first source when conducting research. 


6. Methodology 

A form was developed to track CD-ROM use (see 
Appendix 1). CD-ROM users are requested to complete 
the form before conducting the search. The forms for 
all CD-ROM searches conducted during the period 
from July 1991 to 31 December 1992 were arranged by 
date of search and filed by month. Each search form 
was then carefully examined and the counts recorded 
on five separate sheets used for calculating statistics. 
The raw data were tabulated as shown in Tables 2 to 6. 
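
The manual tallying described above could also be done by program. The following is a minimal sketch, assuming each completed form has been keyed in as a record; the field names are illustrative and are not taken from the KFUPM form, and treating a printed result as a fulfilled need follows the presumption made in Section 7.5.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchForm:
    """One completed CD-ROM search form (field names are illustrative)."""
    search_date: date
    status: str        # e.g. "Undergraduate", "Faculty"
    department: str    # e.g. "CHE", "EE"
    databases: tuple   # databases searched, e.g. ("AST", "COM")
    printed: bool      # True if any citations were printed


def tally(forms):
    """Build five tally sheets, one for each of Tables 2 to 6."""
    sheets = {name: Counter() for name in
              ("month", "database", "status", "department", "need")}
    for form in forms:
        sheets["month"][form.search_date.strftime("%Y-%m")] += 1
        # One count per database, so this sheet can exceed the form count.
        sheets["database"].update(form.databases)
        sheets["status"][form.status] += 1
        sheets["department"][form.department] += 1
        need = "fulfilled" if form.printed else "not fulfilled"
        sheets["need"][need] += 1
    return sheets


forms = [
    SearchForm(date(1992, 10, 3), "Undergraduate", "CHE", ("AST", "COM"), True),
    SearchForm(date(1992, 10, 9), "Graduate", "EE", ("COM",), False),
]
sheets = tally(forms)
```

Because a single form may list several databases, the database sheet counts uses rather than searches, which is why the database total in such a tally can be larger than the number of forms.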


7. Results and discussion 

7.1 Statistics by searches conducted 

During the period between July 1991 (when service 
started) and December 1992, a total of 2378 CD-ROM 
searches were conducted, as shown in Table 2. 

The CD-ROM search service for public use started 
with only two workstations in July 1991. There were 
two main reasons for the low usage of CD-ROM 
databases in the first six months. First, undergraduates 
were not allowed to search databases on CD-ROM 
because it was feared that CD-ROM stations would be 
overcrowded and ‘genuine’ searchers (faculty, 
graduates, research assistants, etc.) would not be able 
to use the service. Second, the KFUPM community 
was not aware of this new end-user facility. 
Table 2. CD-ROM searching use at the KFUPM Library 

MONTH             SEARCHES CONDUCTED 
JULY 1991                         43 
AUGUST 1991                       43 
SEPTEMBER 1991                    27 
OCTOBER 1991                      50 
NOVEMBER 1991                     59 
DECEMBER 1991                     53 
JANUARY 1992                      51 
FEBRUARY 1992                    104 
MARCH 1992                        76 
APRIL 1992                        82 
MAY 1992                         140 
JUNE 1992                         42 
JULY 1992                        135 
AUGUST 1992                      134 
SEPTEMBER 1992                    84 
OCTOBER 1992                     456 
NOVEMBER 1992                    361 
DECEMBER 1992                    438 
TOTAL                           2378 





With the arrival of two more workstations in July 
1992, undergraduates were also allowed to use 
CD-ROM databases, resulting in an increased use of 
CD-ROM databases. The other reason for increased 
use is that instruction in the use of CD-ROM databases 
was provided to different departments. When 
approached, training in the use of CD-ROM is also 
provided to library users. Seminars are also offered 
from time to time. 

By Fall 1992, the use of CD-ROM databases had 
increased considerably. Installation of local area 
networks (LANs) will provide further impetus to the 
use of CD-ROM services. 


7.2 Statistics by databases used 

During the period covered by the statistics, a total of 
2378 searches were conducted, but the eight databases 
were actually searched 2900 times because some users 
searched more than one database during the same 
search session. 

The Applied Science and Technology Index (AST) 
is the most heavily searched database, with 1411 (49%) 
searches. AST is very popular among undergraduates 
because it is an easy-to-use general science and 
technology index and, generally, information on a given 
topic is available in the periodical collection. 
Other heavily searched databases are COMPENDEX 
PLUS (COM), Science Citation Index (SCI), and 
ABI/INFORM (ABI), with 536 (19%), 320 (11%), 
and 319 (11%) searches respectively. From these 
statistics, it is clear that 2586 (90%) of the 
searches were conducted using these four databases. 
COMPENDEX PLUS is the second most used 
CD-ROM database and its use has substantially decreased 
the number of online searches. Only 314 (10%) searches 
were conducted using the remaining four databases 
(DAO, NTIS, MATH, and COMA). The details are 
given in Table 3. 


7.3 Statistics by user status 

Undergraduate students conducted the largest number 
of searches, 1080 (45%), because they need up-to-date 
articles to complete their assignments on time. After 
undergraduates, graduate students conducted the most 
searches, 586 (25%). Graduates also require the latest 
research documents to complete their assignments in 
the semester. Together, graduate and undergraduate 
students conducted 1666 (70%) searches. 
Faculty and research assistants conducted only 704 
(30%) searches. The details are given in Table 4. 


Table 3. Statistics by databases used 

MONTH              ABI   AST   COM  COMA   DAO  MATH  NTIS   SCI  TOTAL 
JULY 1991           12    29     -     -     3     -     1     3     48 
AUGUST 1991          5    28     -     -    11     -     8     1     53 
SEPTEMBER 1991       3    19     -     -     8     -     7     6     43 
OCTOBER 1991         8    36     -     -     7     -     5     6     62 
NOVEMBER 1991       20    39     -     -     6     -     7     6     78 
DECEMBER 1991       12    43     -     -     4     -     9    11     79 
JANUARY 1992         9    30    18     -     4     -     5     8     74 
FEBRUARY 1992       23    41    40     -     6     -     6    14    130 
MARCH 1992          14    28    33     -     6     -     5    11     97 
APRIL 1992           9    47    27     -     4     -     8    13    108 
MAY 1992            22    75    46     -     7     -    11    28    189 
JUNE 1992            9    17    16     -     4     -     1    10     57 
JULY 1992            7    83    37     -    12     -     8    23    170 
AUGUST 1992         19    62    49     -     6     1     7    22    166 
SEPTEMBER 1992       8    28    35     -     6     7     7    23    114 
OCTOBER 1992        41   297   100     2    14     4    10    49    517 
NOVEMBER 1992       45   224    73     8    10    13     9    40    422 
DECEMBER 1992       53   285    62    14    11    10    12    46    493 
TOTAL              319  1411   536    24   129    35   126   320   2900 
PERCENTAGE        (11)  (49)  (19)  (01)  (04)  (01)  (04)  (11)  (100) 
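
The figures discussed in Section 7.2 can be rechecked from the column totals of Table 3. The short sketch below does this; the function and variable names are illustrative only. Shares computed this way may differ by a point from the published percentages, which appear to have been rounded so that they sum to 100.

```python
# Column totals transcribed from Table 3 (database uses, July 1991 to
# December 1992) and the total number of searches from Table 2.
DATABASE_USES = {
    "ABI": 319, "AST": 1411, "COM": 536, "COMA": 24,
    "DAO": 129, "MATH": 35, "NTIS": 126, "SCI": 320,
}
SEARCH_SESSIONS = 2378


def usage_summary(uses, sessions):
    """Return total uses, per-database share in per cent, and uses per search."""
    total = sum(uses.values())
    shares = {db: 100 * count / total for db, count in uses.items()}
    return total, shares, total / sessions


def top_n(uses, n):
    """Return the n most-used databases and their combined number of uses."""
    ranked = sorted(uses, key=uses.get, reverse=True)[:n]
    return ranked, sum(uses[db] for db in ranked)


total, shares, per_session = usage_summary(DATABASE_USES, SEARCH_SESSIONS)
leaders, leader_uses = top_n(DATABASE_USES, 4)
print(total, leaders, leader_uses, round(per_session, 2))
```

The ratio of uses to searches (about 1.2) expresses in one number the observation that some users consulted more than one database in a session.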
We noticed that there is a direct relation between 
the number of undergraduates and the use of AST. 
When the number of undergraduates conducting 
searches increases, the use of AST increases because it is 
the only database available in the KFUPM Library 
which fulfils the needs of undergraduates. The Readers’ 
Guide to Periodical Literature on CD-ROM is under 
consideration for purchase to provide another searching 
source for undergraduate students. 


Table 4. Statistics by user status 

MONTH               UNDER-  GRADUATE   FACULTY  RESEARCH     STAFF      NON-     TOTAL 
                  GRADUATE                     ASSISTANT               KFUPM 
JULY 1991                2        15        16        10         -         -        43 
AUGUST 1991              3         7        23        10         -         -        43 
SEPTEMBER 1991           7        11         5         4         -         -        27 
OCTOBER 1991            20        19         7         4         -         -        50 
NOVEMBER 1991           19        30         8         2         -         -        59 
DECEMBER 1991            6        28        11         8         -         -        53 
JANUARY 1992             4        21        21         5         -         -        51 
FEBRUARY 1992           41        31        16        12         4         -       104 
MARCH 1992              23        10        23        17         3         -        76 
APRIL 1992              35        14        22        10         1         -        82 
MAY 1992                46        49        32        10         3         -       140 
JUNE 1992                6        13        22         1         -         -        42 
JULY 1992               68        39        15         9         4         -       135 
AUGUST 1992             39        35        32        22         6         -       134 
SEPTEMBER 1992          17        25        22        15         5         -        84 
OCTOBER 1992           275        91        54        29         7         -       456 
NOVEMBER 1992          192        91        33        37         8         -       361 
DECEMBER 1992          277        57        65        31         6         2       438 
TOTAL                 1080       586       427       236        47         2      2378 
PERCENTAGE            (45)      (25)      (18)      (10)      (02)         -     (100) 


7.4 Statistics by departments 

All CD-ROM searches were conducted by 16 
departments of KFUPM. The Chemical Engineering 
department topped the list with 352 (15%) searches. 
The Electrical Engineering and Mechanical Engineering 
departments took second and third place in the list 
with 333 (14%) and 302 (13%) searches respectively. 
Together, these three departments conducted 987 (42%) 
of the total 2378 searches. The number of CD-ROM 
searches conducted by the three departments indicates 
that they were engaged in more research-oriented work 
than other departments. The remaining 13 departments 
conducted 1391 (58%) searches. The details are given 
in Table 5. 


Table 5. Statistics by departments 

MONTH             CHE   EE   ME  CIM   RI CoED I&CS   SE   CE  CHM  CEM  ESc  PHY MATH   PE  ELC  TOTAL 
JULY 1991           2    7    2   11    3    3    3    -    2    5    -    -    -    -    5    -     43 
AUGUST 1991         2    7    2    3    3    1    3    4    7    1    1    -    5    2    2    -     43 
SEPTEMBER 1991      2    5    5    2    3    -    1    -    6    -    1    -    -    -    2    -     27 
OCTOBER 1991        3    2    8    7    3    2   12    3    -    -    7    1    1    -    1    -     50 
NOVEMBER 1991       -    5    5   13    1    6    7    2    3    1   14    -    1    -    -    1     59 
DECEMBER 1991       5    4    6    5    4    5    2    5    3    7    3    -    -    -    2    2     53 
JANUARY 1992        4    8    3    6    6    3    2    -    2    6    4    -    -    4    1    2     51 
FEBRUARY 1992      11    8   20   17   10    9    3    3    5    9    1    2    -    2    2    2    104 
MARCH 1992          3    7   10   10   20    5    4    1    4    6    -    3    2    1    -    -     76 
APRIL 1992          4   16    6    4    7   10    6    4    7    8    2    3    2    1    2    -     82 
MAY 1992            8   22   16   15   10   15    8    7   10   20    7    -    -    2    -    -    140 
JUNE 1992           -   11    2    3    7    -    -    1    2    7    1    5    1    -    2    -     42 
JULY 1992           9   11   33    2   11    4   22   13   17    1    2    -    2    3    5    -    135 
AUGUST 1992         9   24   22    6   10    1   14   14   13   12    4    2    -    2    1    -    134 
SEPTEMBER 1992      9    9    8    3   21    -    7    4    4    8    2    2    1    4    -    2     84 
OCTOBER 1992      112   89   70   25   17   13   13   39   15    9   15   16   13    4    6    -    456 
NOVEMBER 1992      80   42   29   24   18   34   20   28   20   14   16    6    7   12    6    5    361 
DECEMBER 1992      89   56   55   56   24   57   22   20   10    7    5    9   14   11    2    1    438 
TOTAL             352  333  302  212  178  168  149  148  130  121   85   49   49   48   39   15   2378 
PERCENTAGE       (15) (14) (13) (09) (07) (07) (06) (06) (05) (05) (04) (02) (02) (02) (02) (01)  (100) 

NAMES OF DEPARTMENTS: 
CHE - Chemical Engineering 
CHM - Chemistry 
CE - Civil Engineering 
CoED - College of Environmental Design 
CEM - Construction Engineering Maintenance 
ESc - Earth Sciences 
EE - Electrical Engineering 
ELC - English Language Centre 
I&CS - Information & Computer Science 
CIM - College of Industrial Management 
MATH - Mathematical Sciences 
ME - Mechanical Engineering 
PE - Petroleum Engineering 
PHY - Physics 
RI - Research Institute 
SE - Systems Engineering 

The RISD is planning bibliographic instruction in 
the use of CD-ROM databases for the remaining 
thirteen departments. In addition, various research 
activities of the three leading departments (CHE, EE, 
and ME) will be highlighted to encourage departments 
not taking advantage of the facility to do so. 


7.5 Statistics by user needs 

Out of the total 2378 searches conducted during the 
statistics period, patrons printed search results in 
2050 (86%) cases. It is presumed that printing a search 
result indicates that the patron’s needs were fulfilled, 
although it is possible that some patrons printed search 
results that were of no use. In only 328 (14%) cases did 
patrons not print search results; it is presumed that 
their needs were not fulfilled. The details are given in 
Table 6. 


Table 6. Statistics by user needs 

MONTH             FULFILLED  NOT-FULFILLED  TOTAL 
JULY 1991                27             16     43 
AUGUST 1991              42              1     43 
SEPTEMBER 1991           22              5     27 
OCTOBER 1991             43              7     50 
NOVEMBER 1991            57              2     59 
DECEMBER 1991            51              2     53 
JANUARY 1992             50              1     51 
FEBRUARY 1992            98              6    104 
MARCH 1992               68              8     76 
APRIL 1992               73              9     82 
MAY 1992                125             15    140 
JUNE 1992                38              4     42 
JULY 1992               129              6    135 
AUGUST 1992             119             15    134 
SEPTEMBER 1992           75              9     84 
OCTOBER 1992            373             83    456 
NOVEMBER 1992           307             54    361 
DECEMBER 1992           353             85    438 
TOTAL                  2050            328   2378 
PERCENTAGE             (86)           (14)  (100) 





8. Conclusion 

An academic library establishing a CD-ROM search 
service must anticipate its impact on library services. 
Online searching, bibliographic instruction, and 
interlibrary loan are some of the services which are 
directly affected by the addition of this new end-user 
technology. CD-ROM user statistics can be applied to 
determine use and staffing patterns for CD-ROM 
installations. Also, selection and cancellation 
decisions can be made easier using data gathered on 
use, cost, and scope of coverage. 


May 1993, Aslib Proceedings 


The CD-ROM search record can be a useful 
analytical tool. As we learn to interpret the CD-ROM 
search record, it becomes a powerful diagnostic tool in 
the evaluation of the effectiveness of CD-ROM search 
services and can provide guidelines by which the 
library can be improved. 
APPENDIX 1 


NOTE: PLEASE READ INSTRUCTIONS TO CD-ROM USERS OVERLEAF 
BEFORE COMPLETING THE FORM 


KFUPM LIBRARY 
REFERENCE & INFORMATION SERVICES DIVISION 


CD-ROM SEARCH FORM 
Date: 
Name:                    Dept.:                    Time: 
Status:  [ ] Faculty        [ ] Research Asstt.  [ ] Graduate 
         [ ] Undergraduate  [ ] Staff            [ ] Non-KFUPM 
Key words: 


Please mark the databases that you are particularly interested in searching: 

[ ] ABI/INFORM ONDISC 
[ ] APPLIED SCIENCE & TECHNOLOGY INDEX (AST) 
[ ] COMPENDEX PLUS 
[ ] COMPUTING ARCHIVE (ACM) 
[ ] DISSERTATION ABSTRACTS ONDISC (DAO) 
[ ] MATHSCI DISC 
[ ] NATIONAL TECHNICAL INFORMATION SERVICE (NTIS) 
[ ] SCIENCE CITATION INDEX (SCI) 


No. of citations:  Identified            Printed 


If you need help, please contact the Reference Librarian for conducting the search 


Librarian's signature 
APPENDIX 1 (cont.) 


INSTRUCTIONS TO CD-ROM USERS 


1. Contact a Reference Librarian on reference desk duty to plan 
your search strategy. This will help to retrieve relevant 
references. 


2. Complete the CD-ROM search form before conducting the 
search. 


3. A reference librarian on reference desk duty will start the 
PC, provide the necessary CD-ROM disc for searching, and 
insert it in the CD-ROM drive. 


4. If you need assistance in conducting or ending your search, 
do not hesitate to contact a reference librarian. 


5. Handle the CD-ROM disc with care: 

* Handle discs only by the edges. 

* Don’t write on a disc or its label. 

* Don’t put stickers on discs 

* Don’t use alcohol or other solvents on disc. If a disc needs 
cleaning, use a blow-brush to remove dust or wipe the 
disc radially with a soft lint-free cloth. 

* Keep discs in their plastic cases when not in use. 

* Do not store them in full sunlight or close to heat sources. 


6. It is the responsibility of the user to take care of the PC, 
CD-ROM disc, and printer while conducting a search. 


7. If there is any security problem, the Manager, Reference & 
Information Services should be informed immediately. 


8. A single search is limited to 20 minutes if someone else is 
waiting. 


9. Printing of search results is free for up to 10 pages for 
KFUPM Library users. Extra pages printed will be charged 
at SR0.50 per page. Non-KFUPM users are charged SR1.00 
per page for every page printed. 


10. In order to avoid the computer virus problem, the users are 
not allowed to insert a floppy disc in any CD-ROM 
workstation disk drive, and downloading is strictly not 
allowed. 


11. When the search is complete, please inform the Reference 
Librarian so that the Librarian will park the CD-ROM 
workstation and take out the disc from the drive. 


INST: 
Telmi: a reusable information retrieval system and its applications 

Edmond Lassalle 


Head of Natural Language Processing Research Group, France Télécom, 
Centre National d’Etudes des Télécommunications, 2 route de Trégastel, 22 301 Lannion Cedex 


Introduction 

The use of an information retrieval (IR) system would 
be easier if natural language processing were applied. 
There are essentially two different ways to use NLP 
techniques: as a user interface coupled with a factual 
database, or as an integrated part of a system which 
deals with a textual database. In this paper, two 
approaches are presented, that of MGS, a commercialized 
system in use in France Télécom, and that of 
Telmi, a France Télécom research system. Telmi is an 
information retrieval system designed for use with 
medium sized databases of short text. The characteristics 
of the system include fine-grained NLP, an open 
domain and large scale knowledge base, automated 
indexing based on conceptual representation of texts, 
and reusability of the NLP tools. The knowledge base 
is (semi) automatically extracted from a monolingual 
machine-readable dictionary (MRD). Telmi is 
integrated into a production-scale prototype which 
implements a Minitel Information Service (IS) for the 
use of the general public. France Télécom Minitel(i) 
and its problems are described, along with the solutions 
Telmi offers. The paper then goes on to describe how 
France Télécom intends to reuse, in a continuation of 
the present project, the Telmi tools in a multilingual 
system, particularly in (semi)automatic data acquisition 
from multilingual MRDs. 


1. The Minitel Service 
The Minitel service was conceived to allow multiple 
access to a large number of private services via a 
videotext terminal. The customers are the general 
public and activities supported include banking, telebooking, 
and medical or juridical consulting services. 
There is, of course, a charge for most of these services, 
which could be a problem for occasional users if 
complicated payment procedures were involved. The 
system is therefore designed to encourage occasional 
traffic and this now represents a substantial user base. 
France Télécom acts as intermediary between the 
services and the customers. No subscription is required; 
part of the call charges is used to pay for the service. 
The connection procedure is also very simple. France 
Télécom provides a videotext terminal free of charge, 
and access is through the telephone network. An end-user 
who wants to connect to a precise service has only 


[Figure: the Minitel service. Labels: General Public (5 000 000); Dial Number 3617; banking, booking, medical or juridical consulting services, etc.] 
to dial a single phone number, which connects to a 
videotext access point(ii). The user then indicates, from 
his terminal device, the code characterizing the service. 


1.1 User needs 

Regular users know the codes for the services they 
require, but occasional users need to know which 
service corresponds to their needs, so this had to be 
catered for. A telematic information service was 
provided for this purpose, and is accessible in the same 
manner as other telematic private services. 

Designed for the general public, such a service 
must be simple, and not require any specialist training. 
Moreover, it should be fast and robust. When this 
service was first devised, NLP techniques were not 
available, so it was a keyword based information 
retrieval system. However, the result of a simple querying 
system was a lack of accuracy, and intractability. 

It was evident that NLP techniques were required 
to accommodate the growing number of services 
(approximately 1,000 per year) and the new domains 
which were evolving. 


2. MGS 
MGS is an information retrieval system which sports 
an NLP user interface. The telematic services are 
classified under different headings and each heading 
corresponds to an activity shared by several services. 
If need be, a service which covers more than one area 
of activity can be classified under more than one 
heading. This classification is carried out manually by 
Telecom staff. 

When retrieving information the query is analysed 
and is mapped to the most relevant heading, and the 
services classified under this heading are then 
retrieved(iii). Applied as a user interface, the purpose of 
MGS is simple: the lexical and syntactical analysis 
transform the initial query sentence into a canonical 
representation(iv). This representation is then treated 
semantically; a thesaurus is used to deduce from the 
analysed form the most relevant heading. If no match 
is possible, a menu-based dialogue then takes place to 
refine the query. 

[Figure: NLP user interface in MGS] 

From the user's point of view, MGS is fast and 
robust, but it does, of course, have some shortcomings. 
As we can see from the example below, this dialogue 
can be arduous: 


Query: garage pour controle de vehicule (check 
point garage) 
Proposed menu: (1) Car Park — (2) garage 


The choice of the Car Park option leads to the 
next menu which contains: 


Property Business, and every other menu option 
except the Automobile Area option 


Because of this inflexibility, menu-based querying 
is used as a last resort to strengthen the robustness of 
the system. In normal situations, NLP must be sufficient 
to provide a list of relevant services, but this is 
increasingly unlikely as the number of services, and 
correlatively, the number of headings, increases. 

For the display services to remain acceptable once 
a relevant heading is matched, the NLP must be accurate 
enough to match headings sufficiently to queries, 
despite the increase in their number. 

Administratively, there are some shortcomings. 
MGS is an expensive, customized system. For instance, 
the thesaurus is constructed 'by hand', and hands are 
not cheap. Moreover, the classification of different 
services into headings, and the modification of headings, 
requires a semantic analysis which must be coherent 
with the description in the thesaurus. We believe, 
however, that the quality achieved more than 
compensates for the costs and effort involved. 

[Figure: List of telematic services] 

Another problem which I should mention, is that 
the system is not easily transferable to another shell, 
and there is an urgent need for new information services 
in France Télécom. 


3. The Telmi research system 

Telmi is a generic information retrieval system with 
reusable linguistic and semantic data. It is the result of 
a CNET research project. One of the goals of this 
project is to reduce the cost of designing information 
services (mainly the genericity and reusability) and to 
ease the administration of the information services 
systems (automated indexing). The accuracy of the 
information retrieval function is another concern. 


3.1 Telmi: a full system 
Telmi includes an NLP module, an automated indexing 
module and a retrieval module. The NLP and indexing 
modules are used to analyse documents and to build 
the indexed textual database. The NLP and retrieval 
modules are then used to analyse queries and to extract 
relevant documents from the database. 

As Telmi is designed for medium-sized databases 
of short texts, the NLP plays an important role in 
disambiguating terms in documents, and translating 
documents into a conceptual representation. 

[Figure: Description of a service] 

Disambiguating terms may be unnecessary, 
however, and a simple representation of documents by 
even ambiguous terms may be sufficient for IR 
processing. Croft³ shows that the redundancy of terms 
inside documents and the use of statistical properties 
can then make up for the lack of good NL processing. 
In the case of short texts, the terms are not redundant 
and disambiguation is needed. 

A conceptual representation of texts is then 
straightforward if a fine NLP is used. Moreover, 
conceptual representation conforms to semantic 
network use. 

Automated indexing then processes all the 
conceptual representations of documents to build the 
database. Salton's vector model⁶ is used: the vector 
space is the space for all the possibilities of representing 
document meanings. The initial conceptual 
representation (including dependency between 
concepts) of each document is, in a vector model, 
simplified and transformed into a vector of concepts, 
each component of the vector corresponding to the 
importance (or weight) of the concept in the meaning 
of the document. 

The importance of the concept is evaluated by its 
power to discriminate between documents. Statistical 
properties of the database are used in this case. The 
weight of each concept is referred to as CF×IDF, where 
CF is the frequency of the concept in the document and 
IDF is the inverse of the frequency of the concept 
within the document collection. 

IDF indicates that high frequency concepts in 
the document collection cannot effectively discriminate 
relevant documents from non-relevant ones while the 
component CF indicates that a concept is a characteristic 
of a given document if it occurs frequently in this 
document. 

During the retrieval phase, the query is analysed in 
the same way by the NLP module and is transformed 
into a vector of concepts. The vector representing the 
query is then compared to vectors representing the 
documents. The cosine criterion is used to order the 
services by relevance ranking. To limit the number of 
comparisons between the query and the documents, 
file inversion is used⁸. 
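
The indexing and retrieval steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Telmi implementation: the function names, the toy service codes and the log-scaled form of IDF are all hypothetical choices (the paper says only that IDF is the inverse of the collection frequency), and concepts are represented as plain strings.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """docs maps a service code to the list of concepts found in its text."""
    n_docs = len(docs)
    # Number of documents in which each concept occurs.
    df = Counter(c for concepts in docs.values() for c in set(concepts))
    # Assumed IDF variant: log-scaled inverse document frequency.
    idf = {c: math.log(n_docs / df[c]) + 1.0 for c in df}
    vectors = {}
    inverted = defaultdict(set)  # inverted file: concept -> service codes
    for code, concepts in docs.items():
        cf = Counter(concepts)  # CF: concept frequency in this document
        vectors[code] = {c: cf[c] * idf[c] for c in cf}
        for c in cf:
            inverted[c].add(code)
    return vectors, inverted, idf


def cosine(u, v):
    """Cosine of the angle between two sparse concept vectors."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_concepts, vectors, inverted, idf):
    """Return service codes ordered by decreasing cosine similarity."""
    cf = Counter(c for c in query_concepts if c in idf)
    query = {c: cf[c] * idf[c] for c in cf}
    # The inverted file limits comparisons to documents sharing a concept.
    candidates = set()
    for c in query:
        candidates |= inverted[c]
    scored = [(cosine(query, vectors[code]), code) for code in candidates]
    return [code for _, code in sorted(scored, key=lambda t: (-t[0], t[1]))]
```

Only documents that share at least one concept with the query are scored, which is the saving that file inversion is meant to provide.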


3.2 The NLP architecture 

The NLP module is composed of lexical, syntactic and 
semantic analysis tools. The characteristics of this 
module are: fine-grained natural language processing, 
an open domain and large scale knowledge base, and 
reusability of the NLP tools and data. 

To be reusable, linguistic and semantic data must 
deal with an open domain, since some applications 
also work in this way. 

The acquisition of the linguistic and semantic data 
represents an important initial investment. To reduce 
the cost, this knowledge base is (semi) automatically 
extracted from a monolingual machine-readable 
dictionary (MRD). 


3.2.1 The lexical level 

The lexical database contains 60,000 lexical entries 
(500,000 inflectional forms due to important 
inflectional properties in French verbs). Though the 
lexical database is large, it must be resident in the 
RAM memory to speed up the lexical access, and a 
special compression algorithm has been devised for 
this purpose. 

Obviously, any lexical analysis must be able to 
deal with problems like misspellings and typographic 
errors, not to mention neologisms. 

With misspellings, people tend to write 
words as they pronounce them. In Telmi, the correction 
techniques use the phonetic form as a criterion to 
match the erroneous word to the correct one. The 
grapheme to phoneme conversion mechanism is based 
on learning techniques with a Markovian model¹. 

Typographic errors are usually due to lack of 
familiarity with the keyboard. The classical s-key and 
o-key corrector systems are avoided due to their well-known 
lack of accuracy when the lexical database is 
large. In Telmi, the lexical database is represented by 
a trie? (retrieval tree) and a tree automaton is used to 
retrieve exhaustively the nearest words: the states 
of the automaton correspond to the different operations 
necessary to detect DIST errors¹¹. DIST errors 


correspond to different categories of typographic errors 
that may occur inside a word (deletion, insertion, 
substitution and transposition). 
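
A minimal sketch of trie-based correction of DIST errors follows. It is an assumption-laden illustration rather than the Telmi automaton: the trie is a nested dictionary, the names `build_trie` and `near_matches` are hypothetical, and the search simply walks the trie while allowing at most one deletion, insertion, substitution or transposition.

```python
END = "$"  # end-of-word marker; assumed never to occur inside a word


def build_trie(words):
    """Store the lexicon as nested dictionaries, one level per character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def near_matches(trie, word, max_errors=1):
    """Return lexicon words within max_errors DIST operations of word."""
    if END in word:
        return []
    found = set()

    def walk(node, i, errors, prefix):
        if i == len(word) and END in node:
            found.add(prefix)
        if i < len(word) and word[i] in node:
            # The typed character agrees with the lexicon.
            walk(node[word[i]], i + 1, errors, prefix + word[i])
        if errors == max_errors:
            return
        if i < len(word):
            # Insertion: the typist added a character, so skip it.
            walk(node, i + 1, errors + 1, prefix)
        for ch, child in node.items():
            if ch == END:
                continue
            # Deletion: the typist dropped ch, so supply it.
            walk(child, i, errors + 1, prefix + ch)
            if i < len(word) and ch != word[i]:
                # Substitution: the typist typed word[i] instead of ch.
                walk(child, i + 1, errors + 1, prefix + ch)
        if i + 1 < len(word):
            a, b = word[i], word[i + 1]
            if a != b and b in node and a in node[b]:
                # Transposition: two adjacent characters were swapped.
                walk(node[b][a], i + 2, errors + 1, prefix + b + a)

    walk(trie, 0, 0, "")
    return sorted(found)
```

Because the walk follows the trie, prefixes that cannot lead to a lexicon entry are abandoned early, which is why this kind of search stays exhaustive without comparing the typed word against every entry.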


3.2.2 The syntactic level 

The syntactic analysis uses the context of sentences to 
disambiguate ambiguous terms, and a conceptual 
representation is built during the analysis. This 
representation must be independent from the different 
possibilities for formulating a sentence, e.g. car 
production versus production of cars, or active voice 
sentence vs passive voice sentence. This analysis must 
also deal with different styles of texts. 

Each kind of application has a particular type of 
sentence to be parsed (grammatically). The grammar 
is described in a declarative form in the LFG formalism 
(Lexical Functional Grammar)². 


3.2.3 The semantic level 

A special compression technique is also used to make 
the semantic network resident in RAM memory 
(100,000 concepts with an average of 3 typed links per 
concept). 

This network is used to disambiguate sentences 
which cannot be dealt with at the syntactic level (the 
prepositional phrase, for example: to eat a fish with 
bones with a fork with a friend). Disambiguation is 
realized macroscopically and uses the co-occurrence 
phenomenon. For instance, the sense of program will 
be understood when cinema program or software 
program is encountered. 

The second role of the semantical analysis is to 
deal with the paraphrase phenomenon. For example, 
one can formulate: skin disease or dermatosis or 
psoriasis. This analysis is also made macroscopically 
by propagation of concepts with weight attenuation in 
the semantic network. 
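
Propagation with weight attenuation can be illustrated as follows. This is a hypothetical sketch: the network is a plain adjacency dictionary without typed links, and the attenuation factor (assumed to lie strictly between 0 and 1) and the cut-off threshold are invented values, none of which come from the paper.

```python
def propagate(network, seeds, attenuation=0.5, threshold=0.1):
    """Spread concept weights through the network, attenuating at each link.

    network maps a concept to the concepts it is linked to;
    seeds maps each concept found in the text to its initial weight.
    """
    weights = dict(seeds)
    frontier = list(seeds.items())
    while frontier:
        concept, weight = frontier.pop()
        passed = weight * attenuation
        if passed < threshold:
            continue  # too weak to propagate any further
        for neighbour in network.get(concept, ()):
            # Keep only the strongest activation reaching each concept.
            if passed > weights.get(neighbour, 0.0):
                weights[neighbour] = passed
                frontier.append((neighbour, passed))
    return weights
```

With such a scheme a query mentioning psoriasis also activates dermatosis and, more weakly, skin disease, so that paraphrases of the same need end up sharing concepts.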


3.3 An example of the utilization of a generic system 
Telmi has been integrated into a production-scale 
prototype. The prototype implements the same MGS 
information service as previously described. 

In the commercialized MGS approach, the 
classification of services under headings is, in many 
ways, an oversimplification. Using Telmi, a much 
more detailed approach can be taken. 

All the services connected to the Minitel network 
already have a description of their own activities, 
which includes headings, a title and an abstract. The 
title and the abstract are NL texts related to the activity 
of the service. 

These texts are analysed by Telmi NLP tools and 
converted into a semantic representation. All semantic 
descriptions are then processed and integrated 
automatically into a textual database. Telmi NLP tools 
are also used in this manner to analyse the user query. 
A match of the resulting semantic representation is 
then processed to provide a list of service codes ordered 
by relevance ranking. 




3.4 The future extension 


In the continuation of the present project, the Telmi 
tools will be reused in a multilingual system, particularly 
in (semi)automatic data acquisition from multilingual 
MRDs. Initially the system will deal with French and 
English, and the goal is to build English linguistic data 
and an interlingual French-English semantic network. 

In this system, French and English documents are 
analysed separately by different NLP modules. The 
French and English modules will use the same analysis 
tools; only the linguistic data will be different. The 
semantic representation of the documents is then 
gathered and processed by interlingual semantic and 
indexing modules to build a unique textual database. 
French and English queries will be processed separately 
and then treated by interlingual modules. 


REFERENCES 


1. BAHL, L.R., JELINEK, F., MERCER, R.L. 
A Maximum Likelihood Approach to Continuous 
Speech Recognition, IEEE Transactions on Pattern 
Analysis and Machine Intelligence, vol.5, no.2, 1983. 
pp.179-190. 

2. BRESNAN, J. The Mental Representation of 
Grammatical Relations, MIT Press, 1982. 

3. KROVETZ, R., CROFT, W.B. Lexical Ambiguity 
and Information Retrieval, ACM-TOIS, vol.10, no.2, 
April 1992. pp.115-141. 

4. KAY, M. Algorithm Schemata and Data Structures 
in Syntactic Processing, in Readings in Natural 
Language Processing, B.J. Grosz et al. eds., Morgan 
Kaufmann, 1986. pp.35-70. 

5. POLLOCK, J., ZAMORA, A. Automatic Spelling 
Correction in Scientific and Scholarly Text, 
Communications of the ACM, vol.27, no.4, 1984. 
pp.358-368. 

6. SALTON, G., McGILL, M.J. Introduction to 
Modern Information Retrieval, McGraw Hill, 1983. 
7. FURNAS, W.G., LANDAUER, T.K., GOMEZ, 
L.M. and DUMAIS, S.T. Statistical Semantics: 
Analysis of the Potential Performance of Key-Word 
Information System, The Bell System Technical 
Journal, vol.62, no.6, 1983. pp.1753-1806. 

8. FALOUTSOS, C. Access Methods for Text, 
Computing Surveys, vol.17, no.1, 1985. pp.3-74. 

9. PERENNOU, G. Verification et Correction 
automatique de textes, T.S.I., vol.5, no.4, 1986. pp.1986. 


10. RIJSBERGEN, C.J. Information Retrieval, 
Butterworths, 1979. 

11. WAGNER, R.A. The String-to-string Correction 
Problem, J. ACM, vol.21, no.1, January 1974. 
pp. 168-173. 


(i) videotext terminal and service 

(ii) in a simplified description of the Minitel network 

(iii) Factual criteria such as geographical criteria may be 
used at this step to choose the most relevant services. 

(iv) Different variants of a sentence (plural form vs singular 
form, inflectional forms of verbs such as tense, mode, 
etc.) are transformed into a canonical form (via the 
so-called lemmatization operation). 


Aslib Proceedings, vol.45, no.5 


CEC PUBLICATIONS 


Aslib, The Association for Information Management, in its role as a National 
Awareness Partner, distributes the following publications on behalf of 
the Commission of the European Communities Directorate General 
for Telecommunications, Information Industries and Innovation (DG XIII) 


IMO REPORTS 


Six Information Market Observatory (IMO) working papers are produced each year. The unit cost is £8 
or 10 Ecu, £48 or 60 Ecu annually. A copy of each working paper is available free of charge to Corporate 
Members of Aslib, The Association for Information Management. 


Probable 1993 topics include: Optical publishing markets in Europe; 
Information Market Policy development (USA, Japan, Europe); 
Online information services; Results of user and executive panel surveys; 
Information services industry in Japan (update); 
The market for chemical information. 

Aslib Corporate Members: FREE 











1993 WORKING PAPERS 


NEW Summary results of market survey on chemical information users 
- IMO working paper 93/1; February 1993; 10pp 


1992 WORKING PAPERS 


Electronic information services in the EFTA countries 
- IMO working paper 92/6; December 1992; 10pp 


Overview of the eastern bloc on-line information services market 
- IMO working paper 92/2; March 1992; 8pp 


Overview of the EC videotex market 1990/91 
- IMO working paper 92/3; May 1992; 7pp 


Structure and development of the European electronic information services 
industry in 1989/90 - IMO working paper 92/5; October 1992; 8pp 


Summary results of 1991 executive (information use) panel 
- IMO working paper 92/1; January 1992; 10pp 


Summary results of 1991 (electronic information services) user panel survey 
(second wave) - IMO working paper 92/4; July 1992; 11pp 


The Information Market Guide contains 580 pages of information products and services which are publicly 
available in Europe and which do not require specifically dedicated equipment to access. This one-stop 
information tool has been collected via a network of national correspondents organized by IMPACT. £40 or 50 Ecu. 


For orders and subscriptions contact: 


THE ASSOCIATION FOR INFORMATION MANAGEMENT 


INFORMATION HOUSE, 20-24 OLD STREET, LONDON EC1V 9AP 
TELEPHONE: 071 253 4488 Fax: 071 430 0514 





801F0393 


4 






PRE-PUBLICATION OFFER... TWO NEW DIRECTORIES 


Aslib announces a special pre-publication offer 
on these two new useful reference directories 


Museum and Special Collections in the United Kingdom 
edited by Keith W Reynard 


* Where do you find collections of artifacts in the UK? 

* Looking for 19th century canal boats? 

* Looking for bangles, baubles or beads? 

* Where are the main collections of Anglo-Saxon coinage? 


This useful guide compiled in association with 
the Museums Association lists museum and 
other collections of artifacts and in particular those 
collections which are unusual, important or 
especially noteworthy. 


The directory will cover all subjects except living 
organisms with an extensive subject index. 


Over one thousand sources will be listed with 
location details including access criteria. 


Museum and Special Collections in the United Kingdom 
297 x 210mm; October, 1993 0851423086 paperback 
Recommended retail price: £124 
Pre-publication price: £110 


Both directories will be published in October, 1993. 


Take advantage of the pre-publication offer for pre-paid 
orders received prior to publication by ordering NOW. 


To order contact 

INFORMATION HOUSE, 20-24 OLD STREET, LONDON EC1V 9AP 
TELEPHONE: +(44) 71 253 4488 Fax: +(44) 71 430 0514 












Aslib Directory of 
Literary and Historical 
Collections in the UK 


edited by Peter Dale 





This companion to the museums' directory will 
contain details of organizations which hold rare, 
large or important collections of orent manuscripts 
and other texts. 


The scope is broad editing literary, historical, ` 


political, sociological, scientific, technological and 
engineering writings and collections. 


Up to 1000 sources will be listed with an extensive 
subject and author index for ease of reference. 





Aslib Directory of Literary and Historical Collections in the UK 


297x210 mm; October, 1993 0851423094 paperback 





Recommended retail price: £124 
Pre-publication price: £110 





Aslib Corporate Member 





DISCOUNT 20% 






























































ASLIB ELECTRONICS GROUP CONFERENCE 1993: 
ELECTRONICS AND INFORMATION MANAGEMENT 


NUMBER 6 
June 1993 


Aslib Proceedings carries papers given at Aslib meetings and conferences, and contributed 
papers of interest to practising professionals. These describe innovations in information 
management techniques, applications of existing information products and services, and 
case studies of information units and special libraries. 


Aslib Proceedings is published monthly and supplied free to members of Aslib. Annual 
subscriptions to non-members: £110 (UK)/£120 (overseas). Single copy prices and 
subscription information are available from the Subscriptions Department at Aslib. 
Advertising information is available from Brian Thackray, Publications Department. 
Further information about Aslib's services and membership rates will be supplied on 
application to the Membership Department, Aslib, The Association for Information 
Management, Information House, 20-24 Old Street, London EC1V 9AP. Telephone: 
071-253 4488; telex: 23667 (answer-back Aslib G); fax: 071-430 0514. 


Manuscripts and editorial enquiries should be addressed to the Editor, Moira Duncan. 
Manuscripts should be 1,000 to 3,000 words, typewritten, double-spaced, single-sided on 
A4 paper. Figures should be reproducible. An abstract of up to 200 words should be 
included. References should follow British Standard 5605: 1990. Published authors will 
receive 10 copies of their paper. 


Aslib, The Association for Information Management 
has some two thousand corporate members worldwide. 
It actively promotes better management of information 
resources. 


Aslib lobbies on all aspects of the management of and 
legislation concerning information. It provides 
consultancy and information services, professional 
development training, specialist recruitment and 


publishes primary and secondary journals, conference 
proceedings, directories and monographs. 


Further information about Aslib can be obtained from: 


Aslib, The Association for Information Management 
Information House 

20-24 Old Street 

LONDON 

EC1V 9AP 

Tel: 071-253 4488 








© 1993 Aslib and contributors ISSN 0001 253X 
Printed in Great Britain by Chappell Graphics and Print Services 


Aslib 
PROCEEDINGS 


Contents 


The great electronic information bazaar 153-159 
- a rough guide to exploring the Internet 
Ian Watson 


Electronic document delivery: a reality at last? 161 — 166 
J Andrew Braid 


Virtual reality: toys or tools of the trade? 167 — 181 
Robert J Stone 


The office you wish you had 183 — 186 
Peter Cochrane, Kim Fisher, Rob Taylor-Hendry 


Aslib Proceedings, vol.45, no.6, June 1993. pp. 151-186 







The great electronic information bazaar 
—a rough guide to exploring the Internet 


Ian Watson 


The Turing Institute Ltd, 36 North Hanover Street, Glasgow G1 2AD 
Paper presented at the Aslib Electronics Group Annual Conference held at Danbury Park Management Centre, 
Chelmsford, 13-15 May 1993. 


Abstract 


The Internet is a decentralized network of computers located throughout the world. Many of these machines (or 
servers) contain information which is freely available while others require payment or at least some form of 
authorization to log in. The growth of the network in recent years has opened up new ways of storing and accessing 
information and presents a challenge for anyone involved in information work. Improvements in telecommunications 
will see the Internet develop into a vital piece of information infrastructure through which it will be possible to 
transmit not just text but images and video. This paper provides a brief overview of the Internet: what it is, where it 
came from and what it offers. It introduces some of the tools that have emerged in recent years to help find and retrieve 
information from the many servers throughout the world. It also provides hints on where to look for more information 
on getting connected. In conclusion some comments are made on the relevance of the Internet for the information 
community and attention is drawn to some policy developments in the USA and the UK. 


Introduction 
‘It is hard now to imagine the difficulty of 
learning to start, drive and maintain an 
automobile. Not only was the whole process 
complicated, but one had to start from scratch... 
With a blank belief that it would not run at all, 
and sometimes you were right.... It required not 
only a good memory, strong arm, an angelic 
temper, and a blind hope, but also a certain 
amount of practice of magic, so that a man 
about to turn the crank of a Model T might be 
seen to spit on the ground and whisper a spell’ 


This is how John Steinbeck, writing in East of Eden 
described the operation of a motor car in the 1930s. He 
goes on to describe how a sage, in the form of a 
mechanic, patronisingly explains to the proud owner 
and would-be car traveller how to make his pride and 
joy work. This passage could easily describe computer 
technology ten years ago. The data processing 
departments, which controlled use of the machines, 
were very powerful and anyone wishing to use a 
computer had to treat this department, and the sages 
who ran it, with some reverence. During the last decade 
a revolution has taken place in which the computer has 
moved out of the data processing department and into 
the home and office. At the same time the power of 
personal and office computers has increased 
dramatically and improvements to the user interface 
have greatly improved usability. A bigger revolution — 
the computer networking revolution — is now under 
way as more people realize the potential of these 
machines as information appliances rather than mere 
word processors or spreadsheet number crunchers. 
Plugged into a telephone line and accessing information 
of almost any kind the computer becomes a tool of 
power and value. Of course computers have been 


networked for more than 25 years but it is only in 
recent years that networking has become accessible to 
home and business users, largely as a result of 
improvements in telecommunications and communi- 
cations software. Steinbeck's account of motor car 
technology in the 1930s could now describe the state 
of the art in networking: it all seems a bit of a mystery, 
shrouded in technical terms and apparently requiring 
some knowledge of black arts. In fact, although there 
is quite a lot of computer jargon attached to networking, 
plugging into the Internet to find and exploit the vast 
amount of information available is getting easier and, 
just as user interfaces have improved the way we use 
computers, we can expect rapid developments in the 
technologies that will help us exploit the potential 
offered by networking. At present, though, the interfaces 
are still a bit primitive and some 'cranking of the 
handle and spitting on the ground’ is still necessary. In 
spite of this the information community is becoming 
much more aware of the potential of the Internet and 
we should be improving our knowledge of what is 
there and how to use it. Much of the literature about the 
Internet makes use of the travel analogy and this paper 
continues in that tradition adding the analogy of the 
bazaar: a chaotic market place, apparently lacking 
structure or order, but offering something for everyone. 

This paper is aimed not at computer fanatics but 
rather at the growing number of people who have 
access to computing facilities and would like to use 
such facilities to help them do their job: people like our 
hero in the opening quotation who would like to start 
travelling rather than fiddling about under the bonnet. 
It will describe what we mean by the Internet, introduce 
the terminology; describe some tools emerging to help 
manage the Internet information explosion and present 
an overview of the kinds of information that can be 
found on it. As a kind of basic traveller's guide it will 
point the reader in the direction of Internet travel 
agents, ie connection services, refer to more detailed 
sources of information and provide guidance on how 
not to offend the residents of machines that you might 
visit on your travels. 


What is the Internet? 

The Internet is a loose amalgam of computer networks 
connecting thousands of sites and millions of users all 
over the world. It really began in 1969 as a US 
government experiment in packet switched networking 
called ARPAnet (US Department of Defense Advanced 
Research Projects Agency Network) which was 
designed to support military research. In particular, 
this experiment was to focus on research on how to 
build networks that could withstand partial damage as 
a result of hostile action. In this model the network is
assumed to be unreliable: a computer only has to put its 
message in a ‘packet’ (envelope) and send it to an 
address (another computer). The network will find a 
way to deliver the message. In the 1970s other
co-operative networks such as UUCP (UNIX to UNIX
Copy Protocol), USENET (Network News) and BITNET
came into being, mostly to serve the needs of the 
academic community. An important step forward took 
place in 1986 with the birth of the NSFNET (The 
National Science Foundation Network), a bold initiative 
to link together US researchers through 5 super- 
computer centres. A number of regional networks 
were established within which each site would 
communicate with its nearest neighbour in order to 
minimize communication costs. The NSF promoted 
universal educational access by funding campus 
connections only if the campus had a plan to spread the
access around. A suite of communication protocols — 
known as TCP/IP (Transmission Control Protocol/Internet
Protocol)—was developed so that messages originating 
on one kind of machine could be understood by any 
other kind of machine. Any computer with this 
communications protocol is capable of plugging into 
the Internet. 

Until recently the Internet was mainly inhabited by 
researchers and computer fanatics who had neither the 
interest nor the need to develop friendly interfaces. 
This is changing as computer networking becomes 
attractive to a wider community and the electronic 
equivalent of cheap charter flights appears on the 
market. Librarians and other kinds of information 
specialists are playing a part. In the UK the JANET 
User Group for Libraries (JUGL) has promoted the use 
of JANET (the UK part of the Internet) as a tool for the
academic library community. Out of JUGL has grown 
BUBL (Bulletin Board for Libraries) which is run 
jointly by the Universities of Glasgow and Strathclyde. 
This bulletin board contains, amongst other things, a 
substantial amount of information about the Internet
(more information available from Dennis Nicholson at 
the University of Strathclyde, Andersonian Library, 
101 St John's Road, Glasgow G4 ONS — email 
cijs03@vaxa.strath.ac.uk). Computer scientists, too,
are recognizing the need to develop better interfaces 
and information retrieval tools. On the network there 
are discussions involving librarians, subject specialists, 
technicians, programmers, user interface designers etc 
aimed at improving the service. 

The UK Office for Library Networking has recently 
published a report¹ highlighting the importance of
networks and calling for a national policy for the 
development of the physical network and for the 
services that run on it. In the United States the Clinton 
Administration is reported to be heavily committed to 
the Internet: as a Senator, Gore, the current Vice President,
introduced the National High Performance Computing 
Act which established NREN, a high bandwidth 
National Research and Education Network which 
will offer greater interconnectivity and higher 
communication speeds and is seen as a key part of the 
USA's economic infrastructure. Discussions are taking 
place on the creation of SuperJANET which is the UK 
equivalent, a topic we shall return to later. 


What does it offer? 

A connection to the Internet gives a user real-time 
access to online databases, library catalogues, software 
archives, full text reports and many other kinds of 
information. Accurate statistics are difficult to come 
by: a recent estimate suggests that there are about 75 
networks and about one million computers from which 
an unknown number of users sent 15 billion packets of
data during one month. It is also reported that the 
Internet gives access to over 21 million files held at 
more than 1000 sites. Between 147,000 and 168,000 
of these files are text and the range includes fiction, 
poetry, newsletters, directories, lists, indexes in various 
formats: ASCII, PostScript, hypertext stacks, SGML 
texts’. 

At the Turing Institute we regularly obtain the full 
text of report literature from sites on the Internet. In the
past we would obtain lists of reports and send off 
orders for those of interest, often enclosing a fee if 
required. This could take weeks or months. Now we 
can log on to a site in the USA and use a facility known
as ftp (see below) to obtain an electronic version of the 
report, usually in PostScript, which can be printed 
locally. The whole process might take 20 minutes or 
less. Further, we do not have to store a hard copy in our 
library and worry about losing it: we can get another 
copy on demand as and when needed. This kind of 
operation provides a glimpse of what is being referred 
to as the virtual library, one in which it is not necessary 
to build a collection locally but rather to build the 
know-how and means to find what you need from a 
virtual bookshelf somewhere out in cyberspace (you 
do not even need to know the geographical location of 
the source). 


How is the Internet run? 


According to Krol’ the Internet is like a church run by 
a council of elders in which every member may have 
an opinion on how it should work and is free to take
part or not as a matter of choice. The Internet, however, 
has no president, chief executive or Pope. The ultimate 
authority is the Internet Society, or ISOC, a voluntary 
membership organization set up to promote global 
information exchange through Internet technology. 
The Internet Architecture Board, IAB, is a group of
invited volunteers which meets regularly, in Krol’s 
analogy, to ‘bless’ standards and allocate resources. 
The Internet works because of these standards which 
allow computers of different makes to communicate 
with each other. The main standard for the Internet is 
TCP/IP (Transmission Control Protocol/ Internet 
Protocol). The ISO (International Standards 
Organization) has been working for many years towards 
establishing an Open Systems Interconnect (OSI) 
standard to replace the TCP/IP but whether this will 
happen is the subject of hot debate and some 
controversy. Happily these highly technical matters do 
not concern us travellers much: they will affect the 
way the whole thing works, but it is not our concern 
here to get our hands dirty under the bonnet, or worry 
about how to build the road. 


Who pays for it? 

The individual networks all have their own financing 
arrangements. In the UK JANET (Joint Academic 
Network) is financed by the Universities and the 
Government, in the United States the NSF (National 
Science Foundation) funds the NSFNET. They all 
agree how to fund interconnections in much the same 
way the postal authorities agree how to pay for the 
carriage of our postcards from Malaga to Manchester. 
The situation is changing, or rather developing, so that 
non-academic bodies can pay for various kinds of 
Internet connection. In fact, many large corporations 
have been on the Internet for many years, although 
most use was by their research laboratories. This was
— and is — legitimate use of the network because the 
funding agencies decreed legitimate use as any activity 
in support of research or education. Use by other 
departments, such as legal or accounting, would require
payment. The increasing ubiquity of computers means 
that other departments are moving in, and as the 
Internet drops its traditional hostility towards 
‘commercial’ users we can expect to see an increase in 
this kind of use. The political support of the US 
government is an important factor too. 


Connection to the Internet 

There are different ways of starting your journey on 
the Internet — that is, getting your machine connected. 
Some people are already connected but do not know it. 
If your system has a TCP/IP connection facility then 
you are connected. The best thing to do is ask your 
system administrator. If you work for a small 
organization or from home then the chances of being 
connected are minimal. A full connection means that 
your machine is connected 24 hours a day to a 
communications link through which data can flow in 
and out continuously. An alternative is where you dial
up an access service (like DEMON) and through that
system connect to whatever Internet services you need.
The key difference is that in the second method it is 
necessary for you to make a connection: electronic 
mail will not automatically arrive on your machine and 
you are not in real-time connection with the millions of
other users. That does not mean that this is not a useful 
method of connection: in fact it can be the most cost 
effective way to get started. A list of Internet connection 
services is given at the end of this paper and books 
such as Krol* offer very readable accounts of the 
different kinds of connection. 


The basic tools 

This section will explain briefly the main tools for 
logging into other machines and for searching around 
to find out what is available. These are: 


Telnet a device (or protocol) which allows 
you to log into other computers on 
the Internet. 


ftp File Transfer Protocol — allows you 
to transfer files from a remote 
computer to your own. 


Electronic Mail For exchanging messages between 
individuals or groups. 


USENET News Discussion groups to cater for all, 
including the most exotic interests. 


Gopher A menu-based system for travelling 
around — sometimes referred to as 
‘tunnelling’ through the Internet. 


Archie A system for locating files that are 
publicly available by ftp. 

WAIS Wide Area Information Server. A 
free text method of searching. 

WWW World Wide Web. A hypertext
system for subject searching.


Telnet 

Telnet is a device for logging into another computer 
which may be in the same room or thousands of miles 
away. The user can use whatever services are provided 
on that machine, for example it could be an online 
public access catalogue (OPAC), it might be a 
commercial database host like Dialog (for which you 
will of course require an account) or it might be a 
gopher (see below) which will point to many other 
sources of information. To use Telnet it is necessary to 
have the address of the remote computer. This usually 
takes a form something like sonne.uiuc.edu and the 
command is simply telnet sonne.uiuc.edu. 

A full explanation of the naming convention is 
beyond the scope of this paper but suffice to say that 
the above address denotes a machine called sonne 
located at uiuc (the University of Illinois at
Urbana-Champaign) which is an educational establishment. In
the UK a typical address might be rsl.ox.ac.uk which 
translates to the Radcliffe Science Library at Oxford
University which is an academic organization in the 
UK. Do not worry if you sometimes see addresses 
written the other way round: this stems from Europe
and the USA having adopted different conventions. 
The network now seems to be able to handle either 
version. 
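
The naming convention can be illustrated with a short Python sketch. It is only an illustration and not a description of how the network itself handles names: the function name to_internet_order and the small set of country codes are assumptions made for the example. It splits an address into its dot-separated parts and, if the address appears to be written country first, turns it round.

```python
# Hypothetical helper: the function name and the country-code list are
# assumptions made for this illustration, not part of any network software.
COUNTRY_CODES = {"uk", "us", "ca", "ch", "fr", "nl"}


def to_internet_order(address):
    """Return the address with the most specific part (the machine) first."""
    parts = address.lower().split(".")
    # An address that starts with a country code is assumed to be written
    # the other way round, e.g. uk.ac.ox.rsl instead of rsl.ox.ac.uk.
    if parts[0] in COUNTRY_CODES:
        parts.reverse()
    return ".".join(parts)


print(to_internet_order("uk.ac.ox.rsl"))
print(to_internet_order("sonne.uiuc.edu"))
```

The check on the first part is deliberately crude: a machine that happened to be called uk would be misread, which is one reason a real system would need something more careful.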


ftp (File Transfer Protocol) 

This is a device for transferring a file to your computer 
from another one. It is sometimes referred to as 
‘anonymous ftp’ because to use it you do not need an 
account on the other computer — you log in 
anonymously, but it is considered polite to enter your 
email address at the password prompt so that your host 
at least knows who the guest is. There are many kinds 
of information you might want to transfer: software, 
opinions, technical papers, standards, song lyrics, banjo 
tablature etc. 
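
For readers who like to see what is happening under the bonnet, the anonymous login can be sketched in Python using its standard ftplib module. This is a minimal sketch under stated assumptions: the function name is invented for the example, and the host, file path and email address in the sample call are placeholders, not real archives.

```python
from ftplib import FTP


def fetch_by_anonymous_ftp(host, remote_path, local_path, email):
    """Copy one file from a public ftp archive to the local machine."""
    with FTP(host) as ftp:
        # Log in as 'anonymous' and, as a courtesy, give an email address
        # at the password prompt.
        ftp.login(user="anonymous", passwd=email)
        with open(local_path, "wb") as out:
            # RETR asks the remote machine to send the named file.
            ftp.retrbinary("RETR " + remote_path, out.write)


# Example call (not run here because it needs a network connection; the
# host, path and email address are placeholders):
# fetch_by_anonymous_ftp("ftp.example.org", "pub/reports/report.ps",
#                        "report.ps", "reader@example.org")
```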


Electronic mail 
This is probably the easiest device to understand and is 
the first encounter many people have with the Internet. 
Email, as it is usually called, is for sending messages to 
specific users or groups of users. Email as a relatively 
new medium has some pitfalls — because it tends to be 
informal it can be easy to offend or convey a slightly 
different meaning from that intended. For this reason 
the smiley face has evolved as a way of conveying 
subtle shades of meaning. Here are two examples 
(look at them sideways!) 
:-) I am being humorous, don’t take this too
    seriously
;-) A wink: ‘catch my drift?’ as in:
    Joe and Anne spent a long time together last
    night working on their presentation ;-).


Happily these are not compulsory and many of the more 
literate users coming onto the Internet will prefer to 
employ their own more traditional skills of composition 
to communicate subtle shades of meaning [ ;-) ].
There are sources of help on netiquette, as it is 
sometimes called, either by ftp or from USENET News 
(see below). 

Some advice on email is worth heeding. Email is 
not very secure and it is considered wise not to commit 
to email anything that you would not want to become 
public knowledge. Machines are fallible and your 
message could end up in someone else’s mailbox, for 
example, the system administrator’s. Worse, if you 
reply to a message that was sent to a group of people 
your reply might end up being sent to everyone in that 
group. Also email is sometimes caught in system
back-up tapes from where it can, if necessary, be
retrieved. This, apparently, is how Oliver North’s 
connection to the Iran-Contra affair was documented. 

An email address looks like a telnet address with a 
prefix to indicate the person at the site. For example, I
am ian@turing.com. Email is like the postal service 
in that it operates a ‘store and forward’ network: you 
put your letter in an envelope, put the addressed
envelope in a post box and the Post Office takes 
responsibility for routing the letter through various 
nodes from where it is forwarded by various modes of 
transport — van, train, air, bicycle, donkey or foot. 
Similarly, you give your email message an address and 
the network will try to find a route. Of course it might 
fail and your message will be returned, usually to you 
but sometimes to your system administrator, so 
remember to be careful what you write! 
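
The ‘store and forward’ idea can be made concrete with a toy model in Python. This is a sketch only: the network of sites is made up, the function names are invented for the example, and a breadth-first search stands in for whatever route-finding the real network performs. It splits an address into person and site, looks for a chain of nodes that leads to the site, and reports the message as returned if no chain exists.

```python
from collections import deque


def split_address(address):
    """Split 'person@site' into the person and the site."""
    person, _, site = address.partition("@")
    return person, site


def find_route(links, start, goal):
    """Breadth-first search for a chain of nodes from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        route = queue.popleft()
        if route[-1] == goal:
            return route
        for neighbour in links.get(route[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(route + [neighbour])
    return None  # no route exists


def send(links, from_site, to_address):
    """Return ('delivered', route) or ('returned', None)."""
    _, site = split_address(to_address)
    route = find_route(links, from_site, site)
    if route is None:
        return "returned", None
    return "delivered", route


# A made-up network: each site lists the sites it can forward mail to.
LINKS = {
    "turing.com": ["relay-one"],
    "relay-one": ["relay-two"],
    "relay-two": ["library.example"],
}

print(send(LINKS, "turing.com", "anne@library.example"))
print(send(LINKS, "turing.com", "joe@nowhere.example"))
```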

An interesting development from email is the 
mailing list. Interest groups set up electronic mailing 
lists to which you can add your name and receive 
messages from anyone else in the group. You can reply 
or ask your own question of the group. In the UK there 
is a service called Mailbase (The NISP Team, The 
University, Newcastle upon Tyne, NE1 7RU. Email
Nisp-Team@uk.ac.mailbase. Tel 091 222 8087.). If 
you send a message to mailbase@uk.ac.mailbase 
with the message lists you will receive a list of all the 
existing lists. For example LIS-ALL is a list set up to 
facilitate communication between the library and 
information community. There is also a SuperJANET 
list where you find a lot of useful comment on the 
future direction of the UK bit of the Internet. BUBL 
(The Bulletin Board for Libraries) is a good way of 
keeping up to date with new lists throughout the world. 
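
The request to Mailbase described above amounts to a message whose body is the single word lists. As a hedged illustration, it could be composed with Python's standard email package as follows. The sender address is a placeholder and the function name is invented for the example. The sketch only builds the message; actually sending it would depend on the local mail system.

```python
from email.message import EmailMessage


def build_list_request(sender):
    """Compose (but do not send) a request for the list of Mailbase lists."""
    msg = EmailMessage()
    msg["From"] = sender
    # Address written as given in the text, in the UK (JANET) order.
    msg["To"] = "mailbase@uk.ac.mailbase"
    # The body of the message is the command itself.
    msg.set_content("lists")
    return msg


# The sender address below is a placeholder.
request = build_list_request("reader@example.ac.uk")
print(request)
```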


USENET News or Network News 

USENET News, in fact, pre-dated the Internet and 
many people think that it is the Internet. It is really a 
collection of bulletin boards or discussion groups on 
almost every imaginable topic, from logic programming 
to heavy metal and from religion to pornography. 
News groups grew from small groups of users who 
realized that they could quickly and efficiently 
exchange news, views and technical know-how on
subjects in which they shared a common interest.
There are seven official news categories which are 
known as USENET News. These are well-managed 
and have well-defined rules for creating new groups 
within each category. You need a news reading program 
and the ease with which you can participate in the 
discussions will depend upon the program and the 
terminal at your disposal. It is a very good way of 
keeping informed or seeking opinions, views or 
information. There is a lot of trivia but it is an excellent
way of getting to know what is out there. Many people 
even strike up friendships! The seven major USENET 
News categories are: 

comp Computer Science
news News about the news network
rec Recreational
sci Science
soc Social
talk Talk or debate
misc Miscellaneous


Each of the seven official categories is subdivided
into many specialist groups, for example: 
comp.ai.neural-nets

comp.multimedia 

comp.binaries.ibm.pc 

rec.music.folk 

rec.music.dylan 
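
Group names of this kind read from the general to the specific, with the category first. A small Python sketch, in which the function name and the wording of the fallback message are assumptions made for the example, shows how a name can be taken apart:

```python
# The seven official categories listed above.
CATEGORIES = {
    "comp": "Computer Science",
    "news": "News about the news network",
    "rec": "Recreational",
    "sci": "Science",
    "soc": "Social",
    "talk": "Talk or debate",
    "misc": "Miscellaneous",
}


def describe(group):
    """Return the category description and the more specific parts."""
    parts = group.split(".")
    category = CATEGORIES.get(
        parts[0], "not one of the seven official categories")
    return category, parts[1:]


print(describe("comp.ai.neural-nets"))
print(describe("rec.music.folk"))
```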

Users simply choose which groups they wish to 
‘subscribe’ to and then read the comments posted by 
other subscribers. You are free to join in by answering
questions, posting questions, or making comments — 
or you may choose to say nothing or ‘unsubscribe’. 
Note that there is no fee to subscribe. New groups are 
set up if there is sufficient demand and a volunteer 
agrees to be the list owner. The owner’s responsibility 
is to keep order and maintain decency. 

It is possible for local administrators to set up local 
groups and this is quite common. Agreements are 
often reached between administrators to feed these 
news groups to other sites which have an interest. Now 
there are many useful local groups distributed almost 
as widely as the core USENET groups. The most 
common group in this category is alt — groups that 
discuss alternative ways of looking at things. Anarchy 
tends to prevail in these groups and for some the 
normal netiquette may be set to one side. There are 
some really bizarre groups in this category: 

alt.fan.monty-python 

alt.alien.visitors 

alt.sex.bondage 

Some alt groups eventually gain respectability and 
migrate to the official USENET groups, eg alt.gopher, 
which was set up to exchange news on one of the first 
tools for searching the Internet, Gopher. 

Some groups are moderated which means that a 
moderator or controller vets all posting to the group. 
An example of this kind of group is
comp.news.newproducts. The moderator has evolved rules and
guidelines to help this group effectively disseminate 
information about new products without it becoming 
cluttered with electronic junk mail. Most groups,
however, are unmoderated and rely on self discipline to
keep order. Some participants tend to make contentious,
dismissive, rude and even offensive remarks in
discussions (in much the same way that some easy-going
individuals undergo a Jekyll and Hyde kind of
transformation when they get behind the wheel of a car). A
phenomenon known as flaming has evolved in which 
a subscriber is so enraged by a particular statement or 
opinion that he/she sends off a stinging rebuke. A reply
to the rebuke may follow which raises the temperature 
and others may join in leading eventually to a second 
phenomenon known as a flame war. Wars have been 
known to occur on almost any subject (is the Amiga 
better than the Commodore? is Bob Dylan a folk 
singer?) and usually continue until everyone loses 
interest or the group owner (there is an owner for every 
official USENET group) intervenes requesting peace. 
The best advice is not to get involved in flaming and if 
someone flames you, do not reply immediately. By the 
time you have thought about it someone else will 
probably have demolished your assailant for you. 
New users are therefore encouraged to take heed
of some basic guidance (a kind of Highway Code) 
which is readily available in a newsgroup called 
news.announce.newusers. Here is where you will 
find FAQs (Frequently Asked Questions) and much 
more friendly and helpful advice on how things work. 
As with email, users are reminded to be careful to 
avoid causing offence or embarrassment: one dictum 
is never to post anything you would not be happy to see 
on the front page of tomorrow’s newspaper. 

On a more serious or business-like note some 
commercial information services are distributed 
through network news. ClariNet Communi- 
cations Corporation (Waterloo, Ontario — 519 884 
7473 —email grant@looking@ca.uwaterloo.watmath) 
distributes Newsbytes, an electronic newspaper devoted 
to computing and telecommunications. Its news stories 
are organized into a number of groups: IBM, Apple, 
Government, Telecommunications, Business, Trends 
and UNIX. Users must pay a subscription to ClariNet,
who in turn have made an arrangement with the various
networks for distribution. Subscribers receive a daily 
feed of news stories — a very useful way of keeping 
abreast of technical and commercial developments. 


Archie 

Archie was developed by the School of Computer 
Sciences at McGill University to solve the problem of 
finding information which is stored in the many public 
access archives on the network. It indexes about 1200 
servers and 2.1 million files*. There are a number of 
different ways of using Archie but essentially you give
it a character string and it returns the names of files 
which match, together with the ftp address of the site. 
You then use ftp to fetch the files you want. 

Some background information might be helpful. 
Archie is a pair of software tools: the first maintains a 
list of about 900 Internet ftp archive sites. Each night 
this software executes an anonymous ftp to a subset of 
these sites and fetches a listing of files, which it stores 
in a database. The second tool is the tool to query the 
database and in fact this is called Archie. It is actually 
quite a crude tool as it is not able to search for the 
content of the files, but only for their names. 
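
The two halves of Archie can be imitated in a few lines of Python. This is a toy sketch under stated assumptions, not the McGill software: the site names and file listings are invented, and a simple in-memory list stands in for the database. The first function plays the part of the nightly collection of listings, the second the query tool, and the example shows the limitation noted above, namely that only file names are matched.

```python
def build_index(listings):
    """Flatten {site: [file paths]} into a list of (file name, site, path)."""
    index = []
    for site, paths in listings.items():
        for path in paths:
            name = path.rsplit("/", 1)[-1]  # keep only the file name
            index.append((name, site, path))
    return index


def query(index, text):
    """Return (site, path) for every file whose NAME contains the text."""
    text = text.lower()
    return [(site, path) for name, site, path in index
            if text in name.lower()]


# Invented listings, standing in for what anonymous ftp would fetch.
LISTINGS = {
    "archive-one.example": ["pub/gopher/gopher1.tar",
                            "pub/gopher/notes.txt",
                            "pub/docs/readme.txt"],
    "archive-two.example": ["software/net/gopher-client.zip"],
}
INDEX = build_index(LISTINGS)

for site, path in query(INDEX, "gopher"):
    print(site, path)
```

Note that pub/gopher/notes.txt is not returned for the string gopher: the match is made on the file name alone, not on the directory or on what the file contains.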


Gopher 
Gopher is a menu-based system which automatically 
makes use of ftp and telnet to find information. It was 
developed at the University of Minnesota as a campus- 
wide information system to enable staff and students 
to access files on many different computers with no 
training. The gopher program can be found using 
Archie and retrieved using ftp. Recently many gophers 
have sprung up at sites such as Oxford and Bradford 
Universities. Connection can be made by telnet. 
Gopher suffers from the problems that beset most 
hierarchical menu systems: it is very easy to burrow and
become lost somewhere deep in the hierarchy. To help 
overcome this a new tool has emerged called Veronica, 
a Very Easy Rodent-Oriented Net-wide Index to 
Computerized Archives. Veronica offers a keyword
search of most gopher-server menus in the entire 
gopher web. As Archie is to ftp archives, Veronica is to 
gopherspace. Frustrated comments in the net news- 
groups reflected the need for such a service. Veronica 
indexes the titles on all levels of the menus for most 
gopher sites in the Internet. Like Archie it does not do 
full text searches of the data and so in that sense is a 
bit limited. 


WAIS 

WAIS (Wide Area Information Server) improves on 
Gopher in that it is not menu based. Essentially it 
searches the menus for you. The software is available 
on the network and a detailed description of how it 
works can be found in Krol* or any of the Internet 
books currently appearing on the market. 


WWW 

World Wide Web is a hypertext overview of 
information available on the Internet and is being 
developed in Switzerland at CERN, the European 
Particle Physics Laboratory. This has pointers to 
other WWW sites, and remote index servers and 
information servers. It is possible to see the simplest 
browser by telnetting to info.cern.ch. In fact, when 
you do that, you can browse through a lot of information 
on the World Wide Web, including technical details of 
the Web itself. You can pick up browser and server 
code from the same site by anonymous ftp. Again a 
more detailed description is beyond our scope and 
the reader is referred to one of the sources of further 
reading. 


What does it mean for the information community? 
When computers demonstrated their superiority in 
manipulating figures, accountants moved up not out. 
They succeeded in getting into senior management 
through their computer-driven power to manipulate 
figures. Librarians might follow this example by using 
the power of WAIS servers, WWW etc to get 
themselves into strategic information management 
positions. So much information is now held on 
computers and so much of it is being made available 
free of charge on public information servers and ftp 
archives. The Internet is now being likened to a 
superhighway along which you can travel using 
‘transport that suits your finances to the destinations
that take your fancy’. In North America the Computer 
Systems Policy Project (CSPP) — an industry lobby 
representing Apple, IBM, AT&T, DEC, HP, Sun and 
others — presented a report to President Clinton in 
which it argues that the creation of a national 
information infrastructure should be a national priority 
and that it should be developed in partnership with the 
private sector. This infrastructure would enable the 
rapid and routine transmission of text, images and 
video. As we noted earlier, Clinton and Vice President 
Gore are enthusiastic supporters of the proposed high
bandwidth National Research and Education Network 
(NREN). Sadly the situation in the UK is different.


The UK Office for Library Networking report referred to
earlier¹ advocates the development of SuperJANET
as the high bandwidth information infrastructure 
capable of supporting images and video. The report 
notes that JANET, the academic network, could be 
opened up to fulfil the role of the UK equivalent to 
NREN in the USA. While this is encouraging, it is 
discouraging to note that, while the CSPP report in the 
USA was picked up by the mainstream press and has 
the support of the Government, the UK report was 
hardly noticed, being launched at a ‘British Library
gathering devoid of professional politicians and
computing moguls’⁶. This seems to suggest that the
support in the UK for the development of an electronic
infrastructure is not as widespread as in the USA, 
where Clinton is offering tax incentives to businesses 
to help them acquire the technology to allow them to 
plug in’. 

All is not gloom, however. The Internet will 
develop further and the tools described here will 
improve through the active involvement of 
information professionals. We noted that no-one owns 
the Internet: virtually all the development has taken 
place as a result of committed individuals responding 
to perceived deficiencies and solving problems as they 
arise. It is truly a bazaar in which librarians and 
information scientists have a role to play as traders 
who know how to come to terms with the electronic 
merchants, tourists, beggars, thieves, fakirs and snake 
charmers who inhabit this electronic market place. 
The best way to find out more is to turn the crank, 
whisper a spell and...... 
Abstract


Document delivery is in the process of undergoing many changes from a manual to an electronic system. Two main 
reasons for this change are presented — the financial pressures on library acquisition budgets and the recent advances 
in enabling technology. A history of some of the trials carried out by the British Library Document Supply Centre 
(BLDSC) is given and current plans for services are described. The role of standards and possible barriers to the 
implementation of new technology systems are briefly discussed. 


1. Definition of Document Delivery 
The British Library Document Supply Centre (BLDSC) 
is the largest organization in the world devoted to Inter 
Library Loan (ILL) and document supply. It offers a 
variety of services with a total demand in excess of 3 
million requests per year. About 25% of these requests 
originate from outside the United Kingdom.
Traditionally all parts of the transaction of requesting,
processing and delivering a document were carried out 
by manual methods. This includes the use of mail 
services for the transmission of requests as well as the 
despatch of items and the manual retrieval of stock. 
It is necessary to carefully define the term ‘document
delivery.’ In this article it is defined to mean the supply 
of a retention copy of a document; a definition which 
is widely used in the USA. It does not apply to the loan 
of items, although some of the techniques applied, 
especially requesting, can be applied to ILL as well as 
document supply. 


1.1 Elements of document supply 
1.1.1 Identification of information 
The first process is the identification of the required 
information. This can be done by many means 
depending upon circumstances. Many abstracting and 
indexing services are now available in electronic format 
either locally or remotely. These allow the user to 
search and retrieve citations and abstracts of relevant 
information. The method of carrying out this searching 
is, in most cases, dependent on the database which is 
being searched, or that of the database host or system 
employed. This means that the user has to learn and 
remember a great many different search commands
if different databases are to be searched. Help is at 
hand in the form of an international standard, the Open
Systems Interconnection (OSI) Search and Retrieve 
protocol (ISO 10162 and 10163). There is also another 
standard which has more widespread use especially in 
North America. This is the ANSI standard Z39.50.
Many new players are entering the area of providing 
current awareness services of journal articles. The 
reason for this, which is expanded upon below, is that 
libraries can no longer afford to subscribe to as many
journals as they would like. A substitute for this is to
provide contents listings of those journals which they 
can no longer afford. Examples of this type of service 
are provided by Online Computer Library Centre 
(OCLC), Faxon, Colorado Alliance of Research 
Libraries (CARL) and the Research Libraries Group 
(RLG) as well as BLDSC. 

Once a search has been performed and a list of 
references has been obtained an assessment has to be 
made of the items in the list and relevant items acquired.


1.1.2 Identification of source of supply 
When the decision has been taken to acquire an article, 
it is necessary to locate a source of supply for it. There 
are several ways in which this can be done. The most 
automated method is linked with the searching process. 
A command is made on the machine on which the 
search is being carried out and the article is ordered 
automatically. Many systems are now available which 
will carry out this process. The more sophisticated of 
these systems will check local holdings and automatically 
place orders for those items which are not held locally. 
Some systems also contain the holdings of remote 
suppliers. This is often the same as the supplier of the 
current awareness service thus facilitating the selection 
of a supplier. Some systems even have in-built selection
algorithms to select the cheapest or fastest supplier. 
If the user does not have access to such a system, it 
is necessary to go to the local library and check if the 
required item is in stock. If it is, it can be retrieved
manually; if it is not, then it must be acquired from an
external source. In this instance the selection of such a
supplier can be more serendipitous.
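The automated route described in this section (check local holdings first, then pick the cheapest or fastest remote supplier that holds the item) can be sketched as follows. This is only an illustration: the `Supplier` record, its fields and the `choose_source` function are hypothetical names, and the selection rule is an assumption, not a description of any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supplier:
    """A hypothetical remote supplier with illustrative cost and speed figures."""
    name: str
    holdings: frozenset   # journal titles the supplier can provide
    cost: float           # price per article, arbitrary currency units
    days: float           # typical delivery time in days


def choose_source(title, local_holdings, suppliers, prefer="cheapest"):
    """Return "local" if held locally, else the best supplier's name, else None."""
    if title in local_holdings:
        return "local"
    candidates = [s for s in suppliers if title in s.holdings]
    if not candidates:
        return None
    if prefer == "cheapest":
        key = lambda s: (s.cost, s.days)   # cost first, speed breaks ties
    else:
        key = lambda s: (s.days, s.cost)   # speed first, cost breaks ties
    return min(candidates, key=key).name
```

A request for a title held locally never reaches a remote supplier; otherwise the caller chooses between `prefer="cheapest"` and `prefer="fastest"`.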


1.1.3 Requesting 

Whichever supplier is chosen, other than a local source, 
it is necessary to send an order for the specified item to 
the supplier. There are several ways of doing this. In 
the simplest system, as outlined above, requests are 
sent by post. However, there are now many automated 
systems for the transmission of requests to document 
suppliers. Many of these are proprietary, the largest 
probably being the OCLC ILL system which handles
over 6 million requests every year. Many database
hosts also offer document ordering modules and many 
document suppliers have their own systems; these are 
all proprietary systems. 

Recently an OSI protocol for ILL requesting (ISO 
10160 and 10161) has been developed. To date there 
are very few implementations of this standard outside 
Canada. It would seem a sensible solution to combine 
the process of searching with that of requesting. There 
are signs that this is starting to happen — one of the first 
fully OSI compliant implementations is a European 
Commission funded project involving partners in 
France, the Netherlands and the United Kingdom. It 
will be some time before all users begin to use such 
sophisticated methods of searching and requesting; 
many users will continue to use the more traditional 
methods for some time to come. 


1.1.4 Processing 
When orders are received at the supplier, by whatever 
means, the request has to be processed. Ideally, this 
should be done as quickly as possible, but it is only
with dedicated suppliers, such as BLDSC, that fast
turnaround times are common. At BLDSC a system,
called Automatch, has recently been implemented to 
match incoming electronic requests against a database 
of holdings. When a positive match is made, which 
happens in over 95% of cases for serial requests, the 
batch of requests can be automatically sorted into the 
correct picking order. Of course this system cannot be 
applied to postal requests, which now account for less 
than 40% of total demand on BLDSC. A feasibility 
study is currently examining the use of optical character 
recognition (OCR) technology to convert postal forms 
into machine readable format to allow automated 
processing as with electronically transmitted requests. 
Once the request has been received and validated, 
the required item is retrieved from the shelf and either 
despatched on loan or a photocopy is produced and 
that is despatched. Much interest is now being expressed 
in the electronic storage of the source material. The 
advantages of this are obvious to the user, whether an 
end-user or a supply centre. One example is the Adonis
system which stores the full text, in image format, of
about 500 biomedical journals on CD-ROM. This
system is in place at BLDSC and a link to the Automatch
system, which is being extended to the article level, is
being planned. In this way individual requests can be 
directed to the Adonis workstation and printed without 
human intervention. 
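The matching step described for Automatch earlier in this section (match incoming electronic requests against a holdings database, then sort the matched batch into picking order) can be illustrated with a minimal sketch. The request format, the title normalisation and the shelf location codes are all assumptions made for the example; nothing here is taken from the actual BLDSC system.

```python
def match_and_sort(requests, holdings):
    """Split requests into (matched, unmatched).

    requests: list of dicts with a "title" key (hypothetical request format).
    holdings: dict mapping a normalised title to a shelf location whose
              string order is the picking order (hypothetical, e.g. "A01-003").
    """
    matched, unmatched = [], []
    for request in requests:
        # Crude normalisation: lower case, collapse runs of white space.
        key = " ".join(request["title"].lower().split())
        location = holdings.get(key)
        if location is None:
            unmatched.append(request)            # left for manual handling
        else:
            matched.append((location, request))
    matched.sort(key=lambda pair: pair[0])       # picking order = shelf order
    return matched, unmatched
```

Requests that fail to match are returned separately, which corresponds to the small proportion of serial requests that still need manual attention.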


1.1.5 Delivery 

Some of the ways that technology can be employed to 
speed up some of the processes in the chain of document 
supply have been outlined above. But the most dramatic 
improvements, certainly as far as speed of supply is
concerned, can be obtained from the use of electronic
document delivery. It can take several days for a
document to be despatched via mail services; this can
extend to weeks in the case of international postal
services. Most of the remainder of this article considers
the use and benefits of electronic document delivery in 
more depth. 


1.1.6 Billing 

The last part of the chain in the process of document 
supply is payment mechanisms. In some systems, such 
as that offered by CARL Systems, payment is made at 
the point of making the request. This has obvious 
benefits for the supplier who receives payment in
advance, but the user may not be so keen to part with
his/her money until the item has been received. (Of
course, BLDSC has always worked on a method of 
advance payment — but with a refund system for items 
not supplied.) Many systems of payment are being 
investigated by various suppliers, often in the hope of 
gaining a competitive advantage. The most common is 
credit card payment, but other systems such as 
proprietary debiting schemes are also available. 


2. Benefits of Electronic Document Delivery 
2.1 Speed 
Electronic document delivery can be performed at a rate
of a few seconds per page; in fact, so fast that time zone
differences around the world can become a real problem.
However, in order to achieve these times, special
telecommunications networks are required. Using the
common facsimile standard, the International Telegraph
and Telephone Consultative Committee (CCITT) Group
3 Recommendation, it takes about one minute to transmit
a typical page from a scientific or medical journal. An
average article is about 10 pages, so it can take in excess
of 10 minutes to transmit an article. This is a tremendous
improvement on traditional methods of transmission
which could take as much as 10 days. However, there
is a price to pay, literally so in the case of group 3 facsimile
transmission: a 10 minute telephone call can be
expensive, about £2.00 in the UK at peak times and as
much as £10.00 if the transmission is from the UK to
Japan. This compares with the cost of posting a
photocopy of about £0.25 and £0.50 respectively. Several
studies of user needs of document supply have asked 
questions about the balance between speed of supply 
and cost. With some notable exceptions, most users 
want a fast delivery but are not prepared to pay for it. 
It may be worthwhile to examine the reasons why 
a typical page from a scientific journal takes so long to 
transmit by facsimile transmission when the same 
amount of data can be transmitted in a few seconds 
from a computer terminal. The reason is that whilst a 
typical page may contain about 5,000 characters, which 
equates to 5 kbytes in text encoded format, in facsimile 
encoded format that same page will be 40 or 50 kbytes. 
There are two reasons why facsimile encoding is used:
(i) it is convenient, as there are now over 10 million
facsimile machines worldwide, and (ii) it can cope with
any type of impression on a printed page, even colour 
on suitable machines. There are other methods of 
encoding pages but the versatility of facsimile encoding 
overcomes its drawbacks. 
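The difference between text encoding and facsimile encoding can be made concrete with a short calculation. The page sizes are those quoted above; the line rate of 9,600 bit/sec is an assumption (a common modem speed for group 3 machines) and is not stated in the text. Call set-up and per-page handshaking are ignored, so the real figures will be somewhat higher.

```python
def transmit_seconds(kbytes, bits_per_second):
    """Time to send `kbytes` (1 kbyte taken as 1,000 bytes) at a given line rate."""
    return kbytes * 1000 * 8 / bits_per_second


LINE_RATE = 9600          # bit/sec; assumed modem rate, not stated in the text
TEXT_PAGE_KBYTES = 5      # from the text: about 5,000 characters
FAX_PAGE_KBYTES = 45      # from the text: 40 or 50 kbytes; midpoint taken

text_time = transmit_seconds(TEXT_PAGE_KBYTES, LINE_RATE)
fax_time = transmit_seconds(FAX_PAGE_KBYTES, LINE_RATE)

print(f"text page: {text_time:.1f} s, facsimile page: {fax_time:.1f} s")
print(f"ratio: {fax_time / text_time:.0f}x")
```

Under these assumptions the text-encoded page takes a few seconds and the facsimile-encoded page most of a minute, which is in line with the times quoted above.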


2.2 Direct to requester

A further advantage of electronic document delivery is 
the ability to deliver documents directly to the requester. 
In a conventional system an item is normally requested
by means of an intermediary. This intermediary is 
usually a member of staff within the library of the 
requesting organization. Documents are delivered to 
the library. It is then necessary to inform the requester 
that the required document has arrived or use the 
internal mail system to deliver the document to the 
reader. This process can often take over 24 hours. 
Electronic document delivery offers the possibility of 
delivering documents directly to the workstation of the 
requester. 

The problems of image encoding are again apparent. 
The workstation of the user must be able to process 
image files. This normally means the ability to 
decompress and print such files. Although many current 
personal computers are well able to carry out these 
tasks, most do not possess the required software and/or 
hardware. Much of this software and hardware is still 
proprietary. 

There are other problems to be overcome. One of 
these concerns the network address of the receiver. 
Some academic networks may have thousands of users 
attached to them. There are two possible solutions to 
the problem. An intermediary in the requesting 
organization can forward the document to the requester. 
This is a manual process but the process can be 
automated. One way of achieving this is to use the OSI 
X.400 Message Handling System (MHS) protocols. 
The X.400 address can be sent to the supplying 
organization with the request for the document, or else 
the protocol machine in the receiving organization can 
keep a record of the transaction. The supplier can 
either address the document directly to the end user or 
else the protocol machine can automatically forward 
the document. 
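The second option, in which the protocol machine in the receiving organization keeps a record of each transaction and forwards the incoming document automatically, can be sketched as below. The class name, the transaction identifiers and the plain-string addresses are all hypothetical; a real implementation would use X.400 addresses and message handling, which are not modelled here.

```python
class ForwardingAgent:
    """Hypothetical stand-in for the protocol machine in the receiving organization."""

    def __init__(self, fallback_address="library-intermediary"):
        self._pending = {}                 # transaction id -> requester's address
        self._fallback = fallback_address  # used when no record is found

    def record_request(self, transaction_id, requester_address):
        """Remember who asked for what when the request leaves the organization."""
        self._pending[transaction_id] = requester_address

    def forward(self, transaction_id, document):
        """Return (address, document) for onward delivery.

        The record is removed once used; unknown transactions go to the
        fallback address so that a member of staff can deal with them.
        """
        address = self._pending.pop(transaction_id, self._fallback)
        return address, document
```

The fallback keeps the manual route available for documents that arrive without a matching record.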


3. Reasons for Upsurge of Interest 
3.1 Financial Pressures 
In common with most other institutions, libraries are 
facing increased financial pressure. The situation for 
libraries is much worse than that faced by many other 
institutions because the price of one of their staple 
supplies, viz. serial subscriptions, is rising much faster
than the standard rate of inflation. As a result, libraries
have little alternative but to cancel journal subscriptions. 
(There is evidence that this may lead to a vicious spiral 
of ever increasing prices but this is outside the scope of 
the present article.) In order to provide the same level 
of service to their users, libraries must look for a 
substitute. Libraries will only cancel journal 
subscriptions provided that an alternative source of 
supply can be guaranteed. Ideally, this alternative 
source should provide all the benefits of subscribing 
without the cost. This includes the ability to browse the 
journal, especially the contents page of a journal. 
Initially libraries began to look at resource sharing 
schemes but very few of these have so far developed to
any great extent. At present there is more evidence that
libraries will rely upon some form of contents listing 
service coupled with a document supply service. 
Several of these are now available and noted at the end 
of section 2.1.1. One essential feature of such services 
is a fast and reliable document delivery service. In 
order to provide this, as well as having a guaranteed 
source of the documents, it is necessary to have a fast 
method of transmission. Electronic document delivery 
is the only means of providing this. More and more, 
libraries are looking to electronic document delivery 
to provide a ‘just in time’ instead of a ‘just in case’ 
service. 


3.2 Technology 

Libraries have been helped in this change by the 
introduction of new technologies. It is only very recently
that technology has provided the possibility of fast,
high quality document delivery. This is due not only to
improvements in telecommunication networks and
mass storage devices, but also to the vast increase in
computing power in desktop computers. This allows
the relatively large image files to be transmitted, stored 
and most importantly to be manipulated, including 
printed, quickly and at fairly low cost. 

Transmission rates of 64 kbit/sec are now commonplace
in telecommunication networks and 2 Mbit/sec
is now available on many academic networks. Even 
higher transmission speeds are available or being 
planned in several countries. An example is the 
SuperJANET network in the UK. This is planned 
initially to connect some sites at 10 Mbit/sec and 
others at 34 Mbit/sec. The network is capable of much 
higher bandwidths — up to 600 Mbit/sec. 
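To give a feel for what these rates mean for document delivery, the sketch below estimates the ideal transfer time for one article at each of the rates mentioned. The article size (10 pages at about 45 kbytes each) is taken from the figures in section 2.1; protocol overhead and network contention are ignored, so these are best-case values only.

```python
ARTICLE_KBYTES = 10 * 45   # 10 pages at about 45 kbytes each (figures from section 2.1)

RATES_BIT_PER_SEC = {
    "64 kbit/sec": 64_000,
    "2 Mbit/sec": 2_000_000,
    "10 Mbit/sec": 10_000_000,
    "34 Mbit/sec": 34_000_000,
}


def article_seconds(rate_bit_per_sec, kbytes=ARTICLE_KBYTES):
    """Ideal transfer time, ignoring protocol overhead and network contention."""
    return kbytes * 1000 * 8 / rate_bit_per_sec


for label, rate in RATES_BIT_PER_SEC.items():
    print(f"{label:>12}: {article_seconds(rate):.2f} s")
```

On these assumptions an article needs a little under a minute at 64 kbit/sec and under two seconds at 2 Mbit/sec.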


4. Trials at BLDSC 

BLDSC has experimented with various forms of 
electronic document delivery for over 10 years. During 
this time many problems have been presented. Some 
have been resolved, others have not. Other than the 
now ubiquitous group 3 facsimile machine, there is no 
widespread use of electronic document systems 
anywhere in the world. Many attempts have been 
made to overcome the problems. Below is a brief 
history of BLDSC’s experiences. 


4.1 Group 3 facsimile 

The first attempt at achieving electronic document 
delivery was with CCITT group 3 facsimile 
transmission. The first trials took place in 1982 between 
BLDSC and Chalmers University in Sweden. In a 
period of 3 months over 1,000 pages were transmitted. 
The average transmission time was just less than two 
minutes per page and about 25% of pages had to be 
retransmitted, mainly because of poor copy quality. 
(The first generation of group 3 facsimile machines 
were more primitive, and much larger, than present 
day versions.) As a result of this trial, BLDSC began to 
offer facsimile transmission as part of its Urgent Action 
Service. Although not adequate for everyday use,
group 3 transmission is adequate for articles which are
required urgently. Improvements in equipment and 
the telecommunication networks now mean that the 
average page is transmitted in about one minute and 
the number of pages requiring retransmission has fallen 
to less than 10%. A fundamental problem of group 3 
transmission is that the quality of the received page, 
which is based on the resolution of scanning, is not 
high enough for much of the content of scientific, 
technical and medical (STM) articles. Many of these 
contain sub- and superscripts in formulae and minute 
detail in diagrams and photographs. Group 3 
transmission is incapable of capturing this fine detail. 


4.2 Group 4 facsimile/ISDN 

In an attempt to overcome this problem trials began in
1986 using the CCITT group 4 facsimile protocol. 
Group 4 transmission overcomes many of the 
drawbacks of group 3. It is capable of higher resolution,
up to 400 lines per inch as opposed to 200 lines per
inch for group 3; it has a more efficient compression
algorithm; and it is transmitted at speeds of up to 64
kbit/sec. This means that a whole article, of about 10 
pages, can be transmitted in the same time as a single 
page using group 3 transmission. 

One major drawback of group 4 machines is that 
they require the use of a digital communications 
network. These were being developed as private circuits 
during the 1980s, but the first public switched service 
was launched by British Telecom (BT) in 1985 and 
called Integrated Digital Access (IDA). This was a 
precursor to the Integrated Services Digital Network 
(ISDN). Although BLDSC carried out what is probably 
the first ever group 4 transmission for document 
delivery in September 1986 to University College, 
London (UCL), ISDN did not develop as quickly as 
had been originally thought. It was three years before 
any further trials took place. Even today, more than 6 
years after the first transmission, very few customers 
of BLDSC are using group 4 transmission. The reason 
is the slow development of ISDN. 


4.3 Satellite Transmission 

In an attempt to overcome the problems of the slow 
implementation of ISDN, BLDSC explored the use of 
satellite transmission. This also offers digital 
transmission and has the added benefit that a receive 
dish aerial can be placed almost anywhere (within 
reason) in the footprint of the satellite. BLDSC became 
involved in the Apollo Project. This was a joint 
European Commission/European Space Agency 
project which was designed to overcome the main 
drawback of satellite transmission, namely that it is 
expensive to use. 

Apollo was an attempt to share the use of a single 
satellite channel between information providers who 
had a common group of information receivers. 
Although it worked on an experimental basis, the 
Apollo project stopped when the European Commission
withdrew funding.

BLDSC carried out a further trial of satellite
transmission in 1991. In the intervening 5 years since the
Apollo project, several improvements had taken place
in satellite communication technology. (Many of these
were connected with the increasing use of satellite
communication for Direct to Home (DTH) television.)
One aspect of this was the use of receive-only very
small aperture terminal (VSAT) technology. An
experiment, using a VSAT system supplied by British 
Aerospace Communications, took place for 3 months 
in 1991. This was with three industrial customers of 
BLDSC all based in the UK. This trial had several 
innovative features, the most significant being that it 
used the X.400 Message Handling System (MHS) as 
the method of addressing. Technically, the trial was a 
success but at the service level it was very slow. 


4.4 Networks 

One feature of the delivery mechanisms outlined above
is that they are point-to-point methods of
communication. For example, in the case of facsimile
transmission there must be a machine, or at least a 
facsimile card in a computer, at both the transmitting 
and receiving end. The location of the machine is 
limited to the connection to a telephone line. This 
means that it cannot easily realize the benefit outlined 
above of delivering documents directly to the end- 
user. To achieve this it is necessary to be able to 
transmit documents into the network of the end user. 
Many industrial organizations have such networks but
access to them is often limited for commercial reasons.
Academic networks are more prevalent and have the
added benefit of being much more readily accessible
and also are interconnected.

The problem of the high volume of data to be
transmitted per article still remains. Many networks are in the
process of increasing the available bandwidth but image 
transmission over such networks is still in its infancy. 


5. Standards 

A further reason for the lack of developments in 
electronic document delivery concerns standards. It is 
true that the CCITT Group 3 facsimile recommendation 
is now well established. There is some speculation
about the stability of the CCITT Group 4 facsimile 
recommendation. The problem is even less clear in the 
area of network transmission of images. There are 
several factors to take into account. These fall into two 
categories: those concerned with the image file and
those concerned with the transmission of that file. 


5.1 Images
For the image it is necessary to specify the following as
a minimum:
• the resolution of the scanned image
• the compression algorithm, if any, which is to be
employed
• the file format that is to be used
• some form of description or identification of the
image.

Most library uses of image storage and transmission
use a minimum resolution of 300 lines per inch — some 
applications especially where preservation is involved 
use higher resolutions. The compression algorithm 
which has most widespread use is probably the CCITT 
Group 4 algorithm. With a typical page of text this can 
achieve a compression ratio of about 15-20 times. This 
is reduced for pages which contain non-text material to 
less than 10 times (in extreme cases, for instance pages 
containing a high proportion of photographs, no 
compression of data is possible.) More efficient 
algorithms, such as the JPEG algorithm, or techniques, 
such as fractal compression, are becoming available. 
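The effect of resolution and compression ratio on file size can be worked through as follows. This is a rough sketch: it assumes a bi-level image (one bit per pixel) and an A4 page, neither of which is stated in the text; the resolution of 300 lines per inch and the compression ratios come from the paragraph above.

```python
def page_kbytes(width_in, height_in, dpi, compression_ratio):
    """Size of a 1 bit-per-pixel scanned page after compression, in kbytes."""
    pixels = (width_in * dpi) * (height_in * dpi)
    raw_bytes = pixels / 8            # bi-level image: one bit per pixel
    return raw_bytes / compression_ratio / 1000


A4 = (8.27, 11.69)   # inches; the page size is an assumption, not from the text

raw = page_kbytes(*A4, dpi=300, compression_ratio=1)
print(f"uncompressed: {raw:.0f} kbytes")
for ratio in (20, 15, 10):
    size = page_kbytes(*A4, dpi=300, compression_ratio=ratio)
    print(f"ratio {ratio:>2}: {size:.0f} kbytes")
```

With these assumptions an uncompressed page is a little over 1 Mbyte, and a text page compressed 15-20 times comes to roughly 55-75 kbytes.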

It is common for some file formats to include all 
the additional information as header information in the 
file. One example of this, now established as a de facto 
standard, is the Tag Image File Format (TIFF). This 
gives details, in a header, of all the information above 
plus much more. The major problem with TIFF is that 
it allows many variations of the header information to 
such an extent that there can be incompatibilities 
between different versions. 

There are two recommendations (not standards) 
for image file formats and their transmission. One has 
been produced by the Internet Engineering Task Force 
(IETF), RFC 1314. As might be imagined this deals
only with image transmission over the Internet and
deals with any type of image. The other
recommendation is more specific and is aimed at document
delivery. It was developed by the Group on Electronic 
Document Interchange (GEDI). GEDI is a group of 
libraries and library utilities from France, Germany, 
the Netherlands, the United Kingdom and the United 
States of America. Both recommendations are very 
similar in their use of TIFF as the file header, but GEDI
deals in more detail with the description of the file. It 
is possible to identify the requester, the supplier, 
bibliographic and transaction details from the GEDI 
header and not the TIFF header. 


5.2 Transmission 
It should be theoretically possible to use any file
transmission protocol to transmit an image file. In practice,
because of the large volume of data, it is much preferred
to use a protocol which has some form of error
correction. (With bit mapped images, especially if some
form of compression algorithm is used, one bit in error
can cause a whole page to be rendered unintelligible.)
If OSI-based protocols are to be used, there are two
possibilities, either the X.400 MHS protocol or else 
the file transfer protocol, FTAM, can be used. The 
former has the advantage of allowing delivery directly 
to end users with network access. The latter has the 
advantage of providing a more secure method of 
transmission. If X.400 is used then it must be the 1988, 
or later, version (ISO 10021). This is because some of
the header information is held in ASCII format and 
thus requires different types of data, or body types, to 
be present in the same transmission. FTAM overcomes 
this problem by considering the different types of data
as different records within the file. The GEDI
Recommendation specifies the use of FTAM as the 
transmission protocol. 
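The sensitivity of compressed data to a single bit in error, and the value of a check on the received file, can be illustrated with a small experiment. This is a sketch under stated assumptions: general-purpose zlib compression and a CRC-32 checksum stand in for the image compression and the error handling of a real transfer protocol, and the "page" is just generated text rather than a scanned image.

```python
import zlib

# Stand-in for page data; a real page would be a scanned bit-mapped image.
page = b"".join(b"line %d of an illustrative page\n" % i for i in range(300))
compressed = zlib.compress(page)
checksum = zlib.crc32(compressed)          # would be sent alongside the file

# Simulate a single bit flipped in transit.
damaged = bytearray(compressed)
damaged[len(damaged) // 2] ^= 0x01
damaged = bytes(damaged)


def intact(data, expected_crc):
    """True if the received bytes match the checksum computed by the sender."""
    return zlib.crc32(data) == expected_crc


def readable(data):
    """True if the data decompresses back to the original page."""
    try:
        return zlib.decompress(data) == page
    except zlib.error:
        return False


print("undamaged:", intact(compressed, checksum), readable(compressed))
print("damaged:  ", intact(damaged, checksum), readable(damaged))
```

The checksum always exposes the single flipped bit, and in most cases the damaged file can no longer be decompressed to the original page at all, which is why a protocol that detects the error and retransmits is preferred.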

Of course, not all networks are OSI-based. The
prime example is the Internet in the USA, with
global connections. This uses the TCP/IP-based
protocols. The equivalent to X.400, the Simple Mail
Transfer Protocol (SMTP), cannot cope with image
files and cannot be used. However, the File Transfer
Protocol (FTP) can be, and is, used for image transmission
over the Internet. To overcome the problem of differing 
transmission protocols, many networks are becoming 
multi-protocol. 


6. BLDSC Plans for Future Services 
BLDSC is now actively planning trial and pilot services
in several areas. At present all these trials assume that 
articles will be scanned on demand. This will be done 
on a flat bed scanner in much the same way that a 
photocopy is produced. Initially there will be no attempt 
to store the image for future use. This may happen at 
some future date. 

BLDSC is experimenting with the storage of 
images. This is being carried out in two experiments. 
The first is through the use of the Adonis system. The 
second is a small scale demonstrator project for in-house
scanning and storage of the full contents of journals on 
receipt. It is possible that in the future both these 
systems will be linked with electronic delivery systems. 


6.1 JANET/SuperJANET 

Some time ago BLDSC began discussions with the 
Joint Network Team (JNT) about the possibility of 
using JANET for document delivery. It was agreed 
that X.400 (88) would be used in combination with the 
GEDI recommendations on image format. This would 
allow the transmission of documents to end users, 
although this would not be implemented initially 
because of copyright considerations. It was also 
intended that the system should be as cheap as possible, 
and yet be robust and reliable. As a result work began 
in implementing an MS-DOS based system. Although 
this proved workable, it seemed that it would be very 
slow. The development was upgraded to a Unix-based 
system, but still used a personal computer. 

Several problems have been encountered in the 
implementation and at the time of writing, March 
1993, the system is not yet operational. Technical 
trials are planned within the next two months and a test 
service in Summer 1993. At first only one site will be 
connected. Once it has proved to be technically viable 
another three or four sites will be added in order to test 
the service aspects of the system before making it more 
widely available. 

When the implementation of this system was taking 
place, the announcement of the introduction of
SuperJANET was made. A connection between 
BLDSC and SuperJANET is planned for later during 
1993. The full potential of SuperJANET in connection 
with its use for electronic document delivery has still
to be explored. As SuperJANET will be a multiprotocol
network, the possible uses seem very promising. 


6.2 The Internet 

At the same time that discussions were being held with 
the JNT, BLDSC was being approached by several of 
its American customers asking for electronic document
delivery via the Internet. It was decided to make both
implementations as similar as possible. The initial
problem was that the British Library did not possess a 
broadband link to the Internet. It was decided to make 
use of the 64 kbit/sec link that had been installed to 
connect the image workstation to JANET using the 
FTAM protocol and then convert to FTP using a relay 
on JANET. This was done but it was found to be very 
slow. At the time when the upgrade to a Unix system 
took place a connection to the JANET IP Service (JIPS) 
was installed. This proved to be much faster although 
not ideal and other possibilities are under investigation.
Using the JIPS connection it has proved possible to 
transmit image files on a test basis to users on the 
Internet. It is planned to begin a pilot service very soon. 


6.3 Other Users 

It is recognized that much of the work which is being 
carried out currently is limited to users connected to 
academic networks. It is possible that some 
non-academic users may soon have connections to 
these academic networks. However, the various 
interpretations of acceptable use policies for these 
networks make it difficult to know whether it will be 
possible to use such networks for electronic document 
delivery. It may be possible to find some other type of 
network connection. The position of ISDN now seems 
to be much more stable. It is now widely available in 
the UK and many other countries, although connectivity 
between some countries still remains a problem. It is 
possible for ISDN to support both the X.400 and the 
FTAM transmission protocols and so the services 
described above could be made available to ISDN 
users as well as those users connected to academic 
research networks. 


7. Barriers to Introduction of Electronic Document Delivery
There are many reasons why electronic document 
delivery systems have not yet been implemented. These 
reasons fall into three categories (possibly in ascending 
order of complexity): 

• Technical
• Copyright
• Organizational/Human


7.1 Technical 


Most of this article has concentrated on this aspect and 
it will not be covered further. One final point to make
is that much of the technology which is being employed
is changing rapidly. Systems which are installed today
may look very outdated in as little as 12 months' time.


7.2 Copyright 

Much has been written already about the issues 
connected with the electronic storage and transmission 
of journal articles. The rights holders of the articles, 
normally the publisher, are concerned about two 
aspects. First, the ease with which articles in electronic 
format can be manipulated and second, the ease with 
which users can obtain copies of articles and hence 
cancel subscriptions. Several trial systems have been 
implemented by publishers to explore these issues, 
such as Adonis, and others are being planned. Most of 
these are designed to study user reactions while at the
same time offering the publisher some form of
recompense for the use of articles. A workable solution
has yet to be found. What is clear is that, unless a 
simple and equitable solution is found, there will be 
widespread abuse and, more importantly, users, who 
are also the authors of the articles, will find ways of 
bypassing the conventional publishing process. 


7.3 Organizational/Human 

The effects of the change from a conventional to an
automated method of document delivery should not be
underestimated. The existing system, based on
conventional publishing in paper format, is to be 
replaced by a system of electronic publishing with 
networked access. This will mean that, potentially, 
every user will be able to access all the information 
required from a desktop workstation. 

There are many problems to be overcome to achieve 
this. One concerns payment mechanisms. Users are 
not currently used to paying for information as they 
use it. Both suppliers and rights holders will expect 
that payment at point of use will become the norm. 
Much training is required. Users do not have the skills 
to use these advanced systems, neither do those who 
will be responsible for implementing them. As is
pointed out above, much of the technology to be
employed is very advanced; librarians and others will
require a great deal of training in order to ensure the
successful implementation and use of these systems.


8. Conclusion 

In order to achieve the change to a totally automated 
system of information delivery, electronic document 
delivery will have a vital role to play. There are many 
challenges to be overcome before such automated 
systems will be installed for everyday use. The one 
certainty is that, one way or another, these problems 
will be resolved and at some time in the future it will be 
the norm and not the exception to be reading articles 
such as this in electronic format.


Introduction

Virtual Reality (VR) refers to the computer generation 
of realistic three-dimensional artificial worlds in which 
humans, typically equipped with head-mounted 3D 
displays, interactive gloves and even whole-body suits, 
can be ‘immersed’, and are free to explore and interact 
with graphical objects in real time, using such natural 
skills as looking from different angles, moving, 
pointing, grasping, listening and talking. The early 
history behind the emergence of VR is short and 
incredibly intense and characterized by a small group 
of familiar names. As one of the key figures, Myron
Krueger, has described it, ‘...Like particles in a fission
reaction, personnel from one project disband and 
reappear with new affiliations’. That reaction continues 
today, with a reproduction of the American experience 
in Europe. 

The concept of VR (although originally not referred 
to as such) originated in the early 1960s. Its emergence 
is in part attributed to the work of Ivan Sutherland. His 
first system, called Sketchpad, was a computerized 
design tool which allowed users to design and 
manipulate graphical objects on a screen using a light 
pen. Sketchpad ran on a TX-2, the most powerful 
computer available at that time. It was not only a tool 
for visualization, it was an operating environment in 
its own right. Sutherland’s later work was based on 
research for the Advanced Research Projects Agency 
(known until recently as DARPA; the D is for Defence).
Using a ceiling-mounted, cantilevered stereoscopic 
display headset boasting a 40° field of view — the 
Sword of Damocles — primitive wire frame graphics 
with hidden line removal could be displayed to the 
wearer in stereo, or 3D. 

However, the physical realization of Sutherland’s 
visions — the VR technology being marketed today — 
has only been under development since the early 
1980s, through research efforts in robotic telepresence 
at NASA Ames in California, in association with VPL 
Inc, and in the development of the ‘Super Cockpit’ at 
the Wright Patterson Air Force Base (WPAFB) in 
Dayton, Ohio. The goal of the Super Cockpit work was 
to develop advanced avionics and cockpit management 
systems to permit the screening of pilots of future 
military aircraft from direct visual contact with the 
outside world, generating instead virtual imagery from 
airborne sensors for presentation using large-screens 
or helmet-mounted displays. The motivation behind 
such work was to protect the pilot’s eyes from damage 
caused by laser weapons and nuclear airburst flash. 
The Super Cockpit Programme apparently no longer
exists as such, although some related work is continuing
at Brooks AFB. The original Super Cockpit proponent,
Tom Furness, left WPAFB in 1989 to establish a major 
research effort in VR at the Human Interface 
Technology Laboratory (University of Washington,
Seattle). Industrial concerns can buy into this ambitious
programme for the modest outlay of $50,000 per year. 
With its tradition of following such US Initiatives, 
European advanced cockpit research programmes have 
existed for at least 3-4 years, but due to the classified 
nature of the work, only a few details have been 
released to date. 

Certain parts of the NASA Ames VR project have 
survived. One project focuses on space robotics, 
telepresence, planetary exploration and aero/space 
traffic control. The early VIEW System (Virtual 
Environment Workstation) was developed by the 
Aerospace Human Factors Team at the Ames Research
Laboratories in California and combined a helmet- 
mounted virtual display unit with a VPL DataGlove 
(described later in this Document) for the control of 
dextrous robotic end effectors. Connected speech 
recognition technology plus a 3D auditory display and 
speech synthesizer (the ‘Convolvotron’) also formed 
part of the VIEW workstation. NASA's emphasis on 
the application of VR has changed over the years, 
moving from telepresence (with the recently-cancelled 
Flight Telerobotic Servicer (FTS) Project as a key 
focus) to planetary exploration — the use of detailed 
virtual terrain models of other planets for visualization 
and supervisory control purposes, and the investigation 
of recovery strategies for astronauts who have become 
separated from the Shuttle or Space Station. Researchers 
at Ames are also heavily involved in assessing and 
modelling the human factors aspects of immersive 
display environments, together with developing 
objective methods of testing commercial VR 
equipment. 

An alternative method of generating Artificial 
Realities has been the subject of much research by 
Myron Krueger, who has been investigating the human- 
computer interaction aspects of artificial realities since 
the late 1960s. In 1969 Krueger became involved in a
project known as Glowflow, creating illusions of 
perspective and slope by fitting out normal, darkened 
rooms with tubular lighting at various orientations. 
Through other, more interactive projects (Metaplay, 
Psychic and Maze), Krueger’s pioneering research led 
to the development of the now well-known Videoplace 
System. Essentially, in Videoplace the participant 
stands in front of a large video projection screen that
displays his live silhouette combined with an artificial
reality. By backlighting a translucent screen behind 
the participant (or participants, in the case of a 
networked system), body form can be distinguished 
from the background by video image digitizers. A 
special executive computer processor translates the 
digitized ‘behaviour’ of the participant into a form 
which changes the graphical world and its constituent 
objects appropriately. The illusion of interaction is, 
thus, complete. 
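
The silhouette step described above can be illustrated with a minimal sketch. It is not Krueger's implementation: it simply assumes a greyscale frame in which the backlit screen is bright and the participant is dark, and the function names, the threshold value and the test regions are hypothetical.

```python
def extract_silhouette(frame, threshold=128):
    """Return a binary mask: 1 where a pixel is darker than the backlit screen."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in frame]


def touches(mask, region):
    """True if any silhouette pixel lies inside region.

    region is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = region
    return any(mask[r][c] for r in range(top, bottom) for c in range(left, right))


# A tiny 4x6 greyscale frame: bright backlit screen (255), dark figure (20).
frame = [
    [255, 255, 20, 20, 255, 255],
    [255, 20, 20, 20, 20, 255],
    [255, 255, 20, 20, 255, 255],
    [255, 255, 20, 20, 255, 255],
]

mask = extract_silhouette(frame)
object_region = (1, 0, 2, 2)  # hypothetical graphical object: row 1, columns 0-1
empty_region = (0, 0, 1, 2)   # top-left corner, where only backlight is visible
```

Testing the mask against the screen region of a graphical object is one simple way in which digitized 'behaviour' could be turned into a change in the graphical world.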

Nevertheless, it is without doubt the early NASA 
projects, together with the emergence and early success 
of VPL Inc., that are responsible one way or another 
for bringing VR to the attention of the media and 
general public. For instance, when quoted a sum of 
$1,000,000 for the purchase of a helmet-mounted 
display system from Furness at Wright Patterson, 
Ames’ key VR researcher — Mike McGreevy — paid a 
visit to his local Radio Shack and bought a pair of 
Citizen LCD pocket televisions — the first low-cost 
head-mounted display had been born. 


1. VR on the other ‘side’ of the Atlantic 
Since its introduction to Britain in the early 1990s, VR 
has been portrayed as the ultimate in computer games 
technology. As a result, national awareness of world- 
class British research into serious industrial applications 
has, until recently, suffered from a media preoccupation 
with the hype associated with VR for entertainment 
and leisure.

‘Stone says VR needs to be carefully controlled.
“If VR systems ever find their way into the
home, then that’s the end of civilization as we
know it,” he warns... But Dr Margaret Schotton
of Nottingham University disagrees. “Computer
games are just games, for God's sake!...”’
Playing With Reality by Dr Glenn Wilson
Sunday Mirror Magazine, July 26, 1992

And that’s how it is on the European side of the
Atlantic at the moment. Just as research engineers 
think the media coverage of Virtual Reality (VR) has 
settled down, along come a few more articles which, as 
if by magic, find their way into the offices of those who 
pay wages and sponsor projects. In the main, the 
European media industry’s coverage of VR has not
been bad (although the misquotes can sometimes be
horrendous). Full marks have to go to France in this
respect, in the main giving concise, thorough and
unbiased coverage of initiatives in Europe and further 
abroad. However, reports elsewhere (particularly the 
UK) on important developments and advances in fields
such as aerospace, robotics and human factors have 
been few and far between, defeated from appearing in 
the press by publishers’ needs to attract readers by 
featuring outrageous claims from psychiatrists, new 
virtual rock bands, tenuous links with medical practices 
(such as gall bladder operations), or the remarkable 
achievements of ‘new’ VR experts, leisure companies 
and so-called studios and demonstration centres. Even 
the Sun newspaper recently published a feature on VR,
claiming its use by Richard Branson’s Virgin Airline 
as a cure for fear of flying, creating for the petrified 
traveller the illusion of being on a train. Claiming 
innovation through the use of ‘new’ Japanese 
technology, the feature was accompanied by a picture 
of a businessman, slumped on an aircraft chair, 
overwhelmed by the weight of a W Industries Visette 
headset. The article was dated 1 April 1993! 

So, besides its supposedly-comical use, such as 
that in the Sun, or the portrayal in 1992 of hopeful 
candidates in the UK’s General Election, where is the 
technology actually going? Is it going anywhere at all? 
What has industry achieved so far and what does 
academia have to contribute? Will VR remain a crude 
computer game in the eyes of the general public, or 
does it have the potential for a more serious form of 
interactive visualization for European Industry? 

It is not the aim of this article to present a thorough 
technical review of VR in the UK or throughout 
Continental Europe. Reviews of this sort are very 
quickly out-dated. Reference to the increasing number
of newsletters and journals (VR News, CyberEdge,
Virtual Reality Systems, Virtual Reality Report,
Presence, and so on), not to mention the new technical
books published or scheduled for publication in 1993
(eg. Pimentel & Teixeira, 1993; Earnshaw et al., 1993),
will provide more than enough up-to-date information 
on world-wide developments (refer to the general 
reading list at the end of this paper). Rather, the aim of 
this article is to consider what has happened to VR in 
Europe over the past 24 months or so. The ‘honeymoon’ 
period of short-term international prestige through 
innovation in VR is now considered to be over. There 
must now be a call for genuinely new and well 
coordinated initiatives to be put in place to help avoid 
the historical precedent of losing out to non-European 
research, development and marketing competition, the 
pace of which (but not yet, fortunately, the quality) is 
slowly, but surely, accelerating. 

Real reality in European Research & Development 
is pretty grim at the moment. Across the Continent, the 
politicians and grant-givers seem to have charted a 
course which could well stifle, rather than support 
many attempts to achieve technical excellence in 
industrial and academic research and development 
fields of endeavour. With a few exceptions, the 
collaborative and partly-funded European technology 
initiatives evident to date in areas related to VR have
not been outstanding. The apparent goodwill in 
fostering collaborative projects between countries is 
all-too-often superficial, disguising the fact that each 
nation will 'do its own thing', in the vain hope that it 
will all come together at the end of the project. 
Furthermore, many such projects achieve little 
more than boosting the R&D budgets of large aero, 
space and defence companies who cunningly disguise 
new military products or subsystems they wish to 
develop and commercialize as advanced technologies 
for the ‘general good’ of European Industry and the 
European People.

So does Virtual Reality fit into a similar overall
European scene at the moment? Although one cannot
expect a single publication to contain a comprehensive
listing of all companies and academic bodies involved 
in VR throughout Europe, the 1993 (first) edition of 
Virtual Reality Market Place (Helsel & Doherty, 1992) 
lists a total of 12 ‘centres’ of VR effort: England hosts 
7 organizations, Scotland 2, whilst France, Germany 
and Sweden have 1 each. Of course, these are 
organizations with a product or service to offer to the 
VR community. Bodies involved with strictly basic 
research now number far more than this US publication 
leads one to believe. Yet the replication of effort in VR 
throughout Europe is considerable. Sadly, ‘reinvention
of the wheel’ is commonplace in VR for telepresence
and robotics applications and in sensor-based and
computer-aided design (CAD) data conversion, with
those responsible paying only lip service to key efforts
elsewhere on the Continent over the past 4 or so years.

On some levels, European VR, from the standpoint
of industrial acceptance and in the actual progress 
made, is suffering, although one or two large national 
companies have demonstrated sufficient foresight to 
invest in the technology for research and real 
applications purposes. One of the more interesting 
features of the British ‘mentality’ towards VR is a 
craving to be first on the scene with a new product, 
studio, distributorship agreement or service. It has
become very evident to the few unbiased groups of
researchers in Europe (ie. those with no specific
commercial VR allegiance) that many so-called
‘specialist’ VR companies are actively engaged in
smear campaigns, particularly in the States. For 
example, it was discovered at Meckler’s VR ‘92 
Conference in San Jose that certain UK companies had 
become more preoccupied in 1992 with securing 
licensed distributorships for US hardware and software 
products than they had been in ensuring the quality of 
their own products. Some have even approached US 
companies in an attempt to persuade them to transfer 
existing licences from European distributors who, they 
claim, are ‘not performing well’. 

Whilst touching on the subject of conferences, the 
number of VR events being staged throughout Europe 
has also been increasing exponentially over the past 18 
months. Sadly, many of these lack the technical quality 
of their American counterparts and all-too-often turn 
out to be a ‘shop front’ for VR Companies desperate to 
sell their products and services (this is particularly true 
of the UK). Others are organized by national and 
international bodies who refuse to engage the services 
of well-known researchers and establishments in the 
selection of appropriate papers and demonstrations. 
Consequently, many articles with the most tenuous of 
links with VR get accepted and audiences come away 
less-than-satisfied. There are, of course, exceptions. 
For example, Imagina, held annually in Monte Carlo,
shows much potential in becoming Europe’s answer to
the US Siggraph Event. Other related functions such as
MICAD (held in Paris) and MIMAD (held in Spain)
do much to put VR in its place as far as serious 
industrial visualization applications are concerned. 


2. VR in the United Kingdom 

Currently, the UK still has the prestige of being the 
leading country in Europe, and in some respects the 
World, for VR developments — particularly in the 
fields of distributed parallel processing, peripheral 
design (including tactile feedback) and applications. 
How long this continues remains to be seen. Public 
imagination has certainly been captured in the UK, 
and Eire, for that matter. Even the BBC has recently 
‘gone virtual’, with a sophisticated graphics sequence 
opening its main news programmes, surrounding a 
solitary newscaster (see also the coverage in this paper 
of Advanced Robotics Research Limited’s endeavours). 

In addition to this popular ‘front’, every year the 
Institution of Electrical Engineers (IEE) arranges a new
lecture series designed primarily for their Younger 
Members Section. The Silvanus P. Thompson Lecture 
for 1992-1994 is on the subject of VR (given by the 
author). Many sites across the UK and Eire host the 
lecture and, to date, the attendance record has been 
nothing short of astounding. People from all walks of 
life, of all backgrounds and ages attend the lectures. 
Audiences have averaged 200; on occasions two 
lectures have been presented in one evening. Even 
schoolchildren of Armagh in Northern Ireland have 
chartered coaches to reach the Queens College, Belfast 
Lecture. Such is the attraction of the subject at the 
moment. 

At the end of 1991, there were five main centres of 
VR activity in the United Kingdom — W Industries, 
British Aerospace (Military Aircraft, Brough), 
Dimension International, Division Limited and the 
Advanced Robotics Research Centre in Salford. This 
was still the case at the end of 1992. There are also 
many ‘unsung’ heroes and inventors busily developing 
systems which may well find their way into VR use in 
the not-too-distant future. Like Jim Hennequin in
Cranfield, pioneering the use of his proportional
pneumatic systems for sensory feedback and low-cost
motion platforms (see also the coverage of the work of
the UK’s Advanced Robotics Research Centre below).
Like the small family company in Leicester (not W
Industries) who have produced a low-cost motion base
for less than £15,000. Or the Company TCAS Effects
(Twentyfirst Century Actuators and Sensors) Limited
in South Wales, responsible for developing a British
version of an instrumented suit and glove (albeit initially
for animatronic applications). The TCAS Sensor Suit 
is based around a patented lightweight sensor referred 
to as the TCAS Flexible Linear Sensor (a conductive 
elastomer). ARRL has been assisting TCAS in their 
launch of the glove by designing the interface 
electronics and driver software for VR applications. 
It is unfortunate that some others have had to leave 
the UK in order to find development funding elsewhere 
in the world. A report in VR News (December 1992)
made mention of Goggle Vox, a breakthrough in
head-mounted displays brought about by a diffusion
film called Microsharp, developed by Goggle
Vox’s inventor Willie Johnson and Loughborough
University and currently the subject of a patent 
application. Goggle Vox is not new, as VR News 
implied. About 2 years before the resurrection of the 
concept, Goggle Vox was featured in a UK National 
Newspaper and on the British TV Science Programme 
Tomorrow’s World. Interested parties attempting to 
contact Johnson at his Chiswick home were informed 
that little information on the technology could be 
released due to an apparent exclusivity arrangement 
with W Industries. Shortly after initial contact, follow- 
up calls were thwarted by a ‘number unobtainable’ 
tone. Johnson’s recent reappearance in the USA was 
accompanied by small entries in the UK Scientific
Press claiming lack of development funds forced his 
departure from the UK. One can only guess at how 
often this state of affairs actually results in the UK, and 
other European Countries for that matter, losing other 
valuable inventions and personalities. 


2.1 W Industries 

W Industries of Leicester — a familiar name across the 
international VR scene — has been developing a range 
of ‘Virtuality’ products, primarily for the leisure and 
entertainment market. Although the basic technology 
used in the W Industries’ products is similar in many 
respects to that being utilized in research establishments, 
the Company and its capital investors have expended 
considerable sums of money in packaging and 
marketing the technology in a way which has secured 
a lucrative early market (in excess of a reported $11 
million in 1992). Many of these systems have found 
their way into arcades and ‘simulation centres’, not to 
mention being used for competition prizes in the UK’s 
Daily Mirror, and to advertise (by offering virtual 
hang glider flights) such products as Hero —a new skin 
scent for men — at shopping centres up and down the 
UK. The Company has recently launched its Virtuality 
1000CS Product — a stand-up virtual game system 
which can be networked to provide multi-participant 
involvement. Four such units have been linked in a 
‘Location Based Entertainment’ (LBE) set-up in 
Nottingham (more LBEs are likely to emerge across 
the UK in the coming year or so). The Nottingham 
Legend Quest Centre, owned by Virtual Reality Design
& Leisure, came ‘online’ at the end of January, 1992
and reportedly broke even 6 weeks thereafter. The
LBE concept has the full support of the W Industries
Team, and the Company has produced a thorough
‘how-to-start-your-own-VR-arcade’ brochure which
gives the enthusiastic manager-to-be, or investor, a
detailed account, with specimen cash flow figures, of
how to attract sponsors or investors. The brochure
even suggests the possibility of offering local corporate
membership for companies interested in providing the
ultimate in hospitality. Alternatively, individual clubs
and public houses can approach independent UK
companies to hire the W Systems for anything between
£2,000 and £3,000 per night. 

‘It will blow your mind!’ is the claim of the W 
Industries brochure, although, as can already be seen 
in one or two of the London LBEs, such centres could 
start to become the meeting point of a new wave of VR
enthusiasts. Groups of veteran and fledgling immersees 
are already beginning to form themselves into a 
movement which, quite soon, could rival train spotting 
enthusiasts in terms of the extent of the obsession and 
the numbers involved! An excellent article called Did 
I Get Out of my Brain Last Night? featured in a July 
1992 edition of The Times Saturday Review. Journalist 
Joe Joseph, having spent an evening at one such 
centre, summed up the situation quite well: 

‘...Cyberpunks, as some of these people like to
call themselves, already have such tenuous
links with the real world that a couple of
paracetamol would be enough to tip many of
them into dreamland...’

W Industries has been attempting to open out its
product range and architectures to more serious 
applications, such as engineering design and 
visualization. Despite strong rumours to the effect that 
W Industries was to launch a new lightweight version 
of the Visette head-mounted display and a more 
powerful VR graphics engine for the serious 
applications ‘market’ in November of 1992, nothing 
materialized and a new launch date is expected some 
time in 1993. 


2.2 British Aerospace 

Although many divisions within British Aerospace 
have a token involvement in VR, the most coordinated 
and impressive initiative is based at Brough in North 
Humberside. Led by Professor Roy Kalawsky (the 
UK's first Professor of VR, holding a visiting Chair at 
Hull University), the Brough Division has been 
spearheading UK efforts in advanced fighter aircraft 
cockpit design using VR technologies. One of BAe’s 
systems, RAVE (Real and Virtual Environment) has 
been specifically developed for cockpit training and 
was demonstrated for the first time at the 1992 
Farnborough Air Show. The RAVE development is 
part of a larger BAe project known as VECTA (Virtual
Environment Configurable Training Aid), a sophisticated
test bed for investigating VR technologies and
their potential to replace conventional simulator
methods. Professor Kalawsky, one of the UK’s few
proponents of the importance of considering the human
factor in VR applications, has also been working on a
new book on the subject of VR, due for publication in 
April of 1993, and one of the first to present under a 
single cover the serious scientific aspects of the field, 
rather than the esoteric (and none-too-informative) 
publications evident to date. 


2.3 Dimension International 


Dimension International of Aldermaston in the South 
of England has been pioneering the concept of
‘Desktop’ VR. Despite the fact that Desktop VR has to
date been ‘non-immersive’ (ie. headsets are not used), 
it appears to have found much favour with many 
academic and industrial users who prefer the image 
quality and real-time interactive nature of the 
Company’s 486 PC-based systems over the lower 
visual and dynamic qualities of their immersive 
competitors. The Company has produced a range of 
commercial VR software development packages under 
the product title of ‘Superscape’ (starting with a basic 
VR Toolkit package, expandable with such options as 
networking, applications programmer's interface, and 
so on, up to the full professional Superscape System). 
The graphical output is handled by a SPEA graphics 
card coupled to a high definition monitor. The main 
input devices to the system are a Spaceball controller 
and a standard mouse. The front end of the Superscape
Package consists of three main environments, a Shape
Editor, a World Editor and a Visualiser. In the Shape
Editor, objects are created in 3D. In the World Editor,
pre-defined shapes are placed in their world coordinates.
Dimension's Visualiser allows the user to move around
and interact with the virtual world, taking commands 
from the Spaceball or mouse. Dimension have recently 
announced new developments in sound simulation and 
limited texturing, and there are plans to ‘go’ immersive 
during 1993, by linking two Superscape PC Platforms. 
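
The split between editing shapes and editing worlds can be illustrated with a small sketch. This is not Superscape's actual data model; the class names are hypothetical and placement is reduced to a simple translation, where a real package would presumably also handle rotation and scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A shape defined once, in its own local coordinates."""
    name: str
    vertices: tuple  # tuple of (x, y, z) points


@dataclass
class Placement:
    """One use of a pre-defined shape at a position in the world."""
    shape: Shape
    position: tuple  # (x, y, z) world offset

    def world_vertices(self):
        # Translate each local vertex by the placement's world position.
        px, py, pz = self.position
        return [(x + px, y + py, z + pz) for x, y, z in self.shape.vertices]


# 'Shape editing': define a unit square lying on the ground plane.
tile = Shape("tile", ((0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)))

# 'World editing': place the same pre-defined shape at two world positions.
world = [Placement(tile, (0, 0, 0)), Placement(tile, (5, 0, 2))]
```

Because the shape is defined only once, the same geometry can be reused at many world positions, and a visualiser need only walk the list of placements.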


2.3.1 Cyberzone: VR Comes to Television 
Dimension International has also been working with 
the TV Company Broadsword on Cyberzone, claimed
to be the world's first VR TV Game Show. Although, 
as one might expect, the Game is non-immersive, it is 
interactive, with two competing teams exploring and 
interacting with a common virtual world. One member 
of one of the teams is presented with images of a 
virtual world, projected using a 3 x 4 matrix of TV 
screens, enveloping his or her visual field of view. The 
other team member instructs the virtual ‘explorer’, by 
means of a map display or another virtual representation,
this time of the explorer ‘in situ’. Movement through
the virtual world is achieved by means of pressure
pads which sense the user’s gait, thus allowing him or
her to walk or run on the spot (a system which may
well find applications in interactive CAD). The
opposing team members sit in an enclosed module 
and, by driving a virtual ‘bulldozer’ around the virtual 
world, attempt to thwart the attempts of their adversaries 
by destroying key links in their route to success. 
Although pilot episodes were made in 1991, filming of 
a more polished Cyberzone started at the Granada TV 
Studios in Manchester in October of 1992. The first
screening of the Programme on British Television 
(BBC2) occurred on Monday, January 4 at 6:50 pm, 
featuring football rivals John Fashanu (Wimbledon) 
and John Barnes (Liverpool)! 


2.3.2 Desktop VR and Education 


The first and most widely publicized educational VR 
project to be based on Dimension International
technology was established in the UK, under contract,
at the West Denton High School in Newcastle-Upon-Tyne.
Under the direction of the School's ex-Headmaster,
Michael Clark, sixth-formers helped to
win a £100,000 contract from local industries, notably
NEI Parsons (part of the Rolls Royce Group), to set up
an ‘Intelligent City’ Project. Using the Dimension VR
Toolkit (initially), the School is exploring ways of
using VR to teach foreign languages. Other applications
of the West Denton School's Initiative include industrial
safety training and architectural issues. Recently,
teachers from West Denton who were involved in
Clark's Initiative ‘absconded’ to set up a new VR and
visualization company called Real Time Design. One 
of their first projects involved using Superscape to 
design river-front architectures in Newcastle. 

More recent UK developments in the educational 
and academic field include the University of 
Nottingham's VIRART (Virtual Reality Applications 
Research Team) Project, the aim of which is to explore 
alternative input devices for Desktop VR. In particular 
the VIRART Team will examine applications for
alternative input devices, and identify new
possibilities as well as the ergonomic aspects and
technological limitations of existing interfacing devices.
For example, one goal of VIRART is to take an 
existing instrumented glove and body suit system and 
modify it for use in conjunction with the Dimension 
VRT. This will then be used to control and navigate an 
existing man model in a virtual environment, or to 
simulate physical human movements in a virtual 
prototype design, or to manipulate the virtual 
environment by natural gestures. Another goal is to 
develop a new inexpensive glove system capable of 
recording all joint angles in the hand so that finger and 
palm movements can be realistically simulated by the 
man model in the Desktop VR system. 

The VIRART Team has also been working closely 
with the nearby Shepherd School to introduce Desktop 
VR technology into the educational lives of children 
with learning difficulties and severe physical handicap. 
Incorporating examples from the Makaton Symbol 
and Signing System into the virtual worlds, the first 
aim of the project has been to assess the benefits of 
using VR to help children to master the basics of a 
vocabulary through the association of hand signs and 
symbols with interactive 3D objects. 


2.4 Division Limited 

Founded in 1989, Division Limited of Bristol has been 
pioneering the use of transputer and i860 technology
in their development of the Vision, ProVision and 
SuperVision Systems — modular and high speed 
graphics engines which avoid the processing 
bottlenecks associated with previous approaches to 
generating and interacting with virtual worlds (such as 
the original VPL RB2 architecture, which consisted of 
two Silicon Graphics Iris Workstations and an Apple 
Macintosh). Division’s first Vision System, which 
supported both stereo virtual graphics and video, was
delivered to the UK’s Advanced Robotics Research
Centre (see below) in November of 1990. An upgraded
system, and Division’s first SuperVision Engine (based
on i860 rendering technology) was delivered to the 
Robotics Centre in April, 1992. 

The Company's software is broadly designed within 
the architectural paradigm of ‘actors’ under the control 
of a ‘director’ (a.k.a. a ‘client-server’ architecture). In 
such systems, the software is arranged as a single 
server process which acts like an intelligent telephone 
exchange. The client processes each manage their own 
quite specific parts of the VR simulation. Each client 
loads and maintains its own objects. When a client 
decides an interaction has occurred, it sends a message 
containing the type of interaction (eg. object collision), 
and interaction type-specific information (eg. the 2 
identifiers of the colliding objects) to the server. The 
server then redistributes these interactions to every 
other client that has ‘registered interest’ in that type of 
interaction. 
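
The message flow described above can be sketched in a few lines. This is an illustration only, not Division's software: the class and method names are hypothetical, and everything runs in one process, whereas the real system distributes clients across parallel hardware.

```python
from collections import defaultdict


class Server:
    """Hypothetical 'director': relays interactions to interested clients."""

    def __init__(self):
        # Maps an interaction type to the clients that registered interest.
        self._interested = defaultdict(list)

    def register_interest(self, client, interaction_type):
        self._interested[interaction_type].append(client)

    def report(self, sender, interaction_type, details):
        # Redistribute to every *other* client registered for this type.
        for client in self._interested[interaction_type]:
            if client is not sender:
                client.receive(interaction_type, details)


class Client:
    """Hypothetical 'actor': manages its own objects and reports interactions."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.inbox = []

    def receive(self, interaction_type, details):
        self.inbox.append((interaction_type, details))

    def detect_collision(self, object_a, object_b):
        # The message carries the interaction type plus type-specific data,
        # here the two identifiers of the colliding objects.
        self.server.report(self, "collision", (object_a, object_b))


server = Server()
collision_actor = Client("collision", server)
audio_actor = Client("audio", server)
visual_actor = Client("visual", server)

server.register_interest(collision_actor, "collision")
server.register_interest(audio_actor, "collision")

collision_actor.detect_collision("hand", "cup")
```

In this sketch the audio client hears about the collision (and could, say, play a sound), while the visual client, which registered no interest, receives nothing. Keeping the server as a pure relay means each client can be developed and run independently.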

Division acts as a UK Distributor of commercial 
VR products developed by VPL (until recently, of 
course), and other US concerns, and has recently 
teamed with IBM in the development of a VR Operating
System, UniVRS, for the RISC System 6000 
Workstation. The Company has also ported its operating 
system onto Silicon Graphics hardware (such as 
RealityEngine and Onyx). Division has also recently 
set up a sister company, Division Inc., on the West 
Coast of the USA, merging with Fake Space Labs and 
Crystal River Engineering (the developer of the 
Convolvotron 3D Audio System). The Vision range of 
products is also distributed in the Far East by Matsushita 
Electric Industries. As a Company, Division has been 
steadily expanding over the past 12-18 months and has 
recently acquired newer and larger premises, moving 
from the quaint British Town of Chipping Sodbury to 
a location just north of Bristol, at the intersection of 
two of the UK’s busiest motorways. 


2.4.1 Division and the London Parallel Applications Centre
With its considerable resource in the field of parallel 
processing, Division has, amongst projects considering 
the use of VR for simulation, molecular modelling and 
telepresence, not surprisingly been involved in providing 
equipment and services to the London Parallel Applications
Centre (LPAC). LPAC was established in January
of 1992 as a ‘centre of excellence’ in parallel processing,
under a joint initiative between industry, academia,
and the Department of Trade and Industry. LPAC consists
of three London University colleges (Queen Mary
and Westfield, Imperial College of Science, Technology
and Medicine, and University College London)
plus the City University. The other industrial partner
is the Central Research Laboratories of Thorn EMI. 
Virtual buildings and rooms are being created 
using Division Hardware, in which researchers, 
designers and users can experience and modify a range 
of interiors. Building layout and content will be
dynamically adaptable, in particular the selection,
placement and intensity of light fittings and the
positioning and specification of other light sources.
The basis of the lighting simulation will be ‘parallel
adaptive radiosity’, an innovative technique which
offers the prospect of greatly improved realism in the 
rendering of lighting effects, and which will be further 
developed during the course of the project. 


2.5 UK Advanced Robotics Research Centre 

In part stimulated by the rapid uptake of robotics 
technology by Japan, giving the Country a perceived 
competitive advantage, the identification of advanced 
robotics as an important technology in promoting 
economic growth was one of the key issues discussed 
at a 1982 meeting of the Organization for Economic 
Cooperation and Development (OECD) in Versailles. 
One of the surviving initiatives to emerge from the 
Versailles Meeting is that of a collaborative programme 
in the field of Advanced Robotics, known later as 
IARP — the International Advanced Robotics 
Programme. At the time of the OECD Meeting, 
advanced robotics research efforts in the UK were 
mainly confined to academic institutions, largely 
funded by the Science and Engineering Research 
Council, and to research facilities within defence 
establishments and companies supporting the nuclear 
power industry. Until the launch of the Department of 
Trade & Industry's (DTI's) Advanced Robotics (AR) 
Initiative, there was no national focus to these efforts. 


2.5.1 The Background 
The AR Initiative took the form of three components. 
One of these components concerned the establishment 
of a National Advanced Robotics Research Centre, a 
focal point of expertise in the United Kingdom for 
research and engineering in the field of advanced 
robotic systems and technology. In July 1987, the 
success of a bid led by Salford University Business 
Services Limited was announced. The plans contained 
within this bid proposed the creation of a National 
Centre, run by a Company — Advanced Robotics 
Research Limited (ARRL) – to carry out generic and 
focused research areas central to the study and 
application of advanced robotics, and to provide a 
stable framework of organization and technology upon 
which to support subsequent commercial activities. 
ARRL’s Research Programme activities were 
supported by DTI grant funding to the tune of £5 
million plus resources contributed by a number of 
collaborating industrial and academic organizations. 
In part, collaboration took the form of personnel 
secondment — placing experienced engineers at the 
Centre to work on the Research Programme — and by 
taking up shares in the Company. The Company's core 
technical programme incorporated three fundamental 
levels of development: a basic system functional 
architecture to act as a coordinating technology 
framework for all Programme activities, a series of 
‘Generic Research’ (GR) projects, providing solutions 
to problems in specific critical areas of component 
enabling technology, and a series of ‘Research 
Demonstrator’ (RD) projects each proving feasibility 
of systems representing some subset of the overall 
functional architecture. Of relevance to this article are 
the following RD Projects, all of which were integrated 
by efforts carried out under ARRL’s much-publicized 
Project GR6: Human-Machine Interfaces (Virtual 
Reality & Telepresence). 
RD8: 3D Scene Imaging Using a Laser Range 
Finding System, 
RD1: Computerized Modelling of Remote 
Worlds (Based on Laser Rangefinder 
Data), 
RD7: Teleoperation of, and Telepresence for, 
Mobile and Manipulative Robots. 
The key theme of ARRL’s Project GR6, then, was that 
of exploiting knowledge about the natural capabilities 
and limitations of the human operator in the design of 
control and display subsystems for advanced robots. 
This contrasts sharply with previous practices in 
remotely operated vehicle design where primitive 
human-system interfaces have, as a result of the use 
of non-intuitive and often cumbersome equipment, 
forced operators to attempt to control a remote vehicle 
or manipulator in an inefficient and sometimes 
unsafe manner. 


2.5.2 Virtual Reality & Telepresence 

The original aim of the GR6, or VERDEX (Virtual 
Environment Remote Driving EXperiment) Project, 
then, was to develop an interactive, head-controlled 
audiovisual display system for telepresence/remote 
driving studies using mobile robots. Such a system 
was to be capable of presenting combined stereo TV 
pictures with graphical representations of familiar 
control devices in a telepresence ‘driving’ simulation. 
Interaction with these controls was to be achieved 
using a virtual hand representation, slaved to the finger 
flex and hand tracking sensors of an instrumented 
glove (the VPL DataGlove). 

As work progressed over the first 18 months in 
building the Test Bed, some conceptual and practical 
problems emerged. Foremost were limitations in 
commercial helmet-mounted display technology, 
particularly with regard to the poor resolution and 
contrast of current liquid crystal display units. A second 
issue concerned the reliability and calibration problems 
of the instrumented gloves, questioning how intuitive 
and reliable to use they actually were. A third issue 
centred on further limitations, this time with the 
computing technology required to drive the VERDEX 
Test Bed (restrictions on world model size, primitive 
image quality, lengthy image generation times). The 
main concern, however, was that, if the telepresence 
research goal was to exploit the human’s natural 
skills in MMI design, then interaction with a virtual 
control panel was not the way to achieve it. 

Therefore, the goals of the Centre’s VR Programme 
were rewritten early in 1992, with a new aim of 
allowing the visual reconstruction of a remote 
environment by using the geometric output of non- 
visual sensors (possibly correlated and integrated 
with pre-existing computer-aided design (CAD) 
representations of that environment). The term ‘non- 
visual’ applies to sensors (such as Project RD8’s laser 
scanner, ultrasonics, radar) which would have to be 
brought online, should the robot be deployed in an 
environment where such features as dust, smoke, 
sprinkler spray, fire, turbid water and darkness render 
TV pictures useless for ranges greater than, say, 2 
metres. The updated CAD or processed range image 
model, showing the current state of the environment 
(eg. damage, obstacles) would, using ARRL’s VR 
computing system, be displayed as a stereo virtual 
image to a human operator. Exploration of this virtual 
world would then occur using the human’s natural 
skills. Ultimately the robot would follow the actions 
of the operator, employing its own local collision 
avoidance sensors where appropriate. Once the operator 
was close enough to a task which required more 
detailed visual information than that currently 
offered by VR systems, and, provided the range 
of the task from the robot-mounted TV cameras was 
adequate (given prevailing environmental conditions), 
then he could switch to stereo video mode, using a 
speech recognition system (for example), to complete 
his mission. 

The output of the RD8 Laser Rangefinder in use at 
ARRL was, at the time of the change in emphasis of 
GR6, already providing segmented 3D planar and 
volumetric descriptions of objects in a scene in less 
than 2 seconds, and in 1991, ARRL demonstrated the 
feasibility of converting these descriptions into 3D 
virtual images, suitable for display on a stereoscopic 
headset. Later, in 1992, spatially fused images were 
converted for display using a VR engine. Also as part 
of the telepresence concept, and for close-in task 
performance (ie. on switching to stereo mode), ARRL, 
together with Overview Limited (of North London, 
UK), developed a range of head-slaved stereoscopic 
camera systems for the GR6 Project, capable of pan/ 
tilt carriage speeds of the order of 1800°/sec., and 
suitable for mounting on the Company’s Cybermotion 
K2A Mobile Robot. 
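
The idea of a ‘head-slaved’ camera carriage can be sketched in a few lines of Python: on each update the pan (or tilt) axis moves towards the angle reported by the head tracker, but by no more than the carriage speed allows. This is a minimal sketch only; the function name and the 50 Hz update interval used in the example are assumptions, and the control scheme actually used in the ARRL and Overview systems is not described here.

```python
def slew_toward(current_deg, target_deg, max_rate_deg_s, dt_s):
    """Move one pan or tilt axis toward the tracked head angle, limited by carriage speed."""
    max_step = max_rate_deg_s * dt_s           # largest move possible in one update
    error = target_deg - current_deg           # how far the camera lags the head
    step = max(-max_step, min(max_step, error))
    return current_deg + step


if __name__ == "__main__":
    # Hypothetical example: the head turns 90 degrees; carriage limited to 1800 deg/sec,
    # updated every 0.02 s (an assumed 50 Hz loop).
    angle = 0.0
    for _ in range(3):
        angle = slew_toward(angle, 90.0, 1800.0, 0.02)
        print(angle)
```

The faster the carriage, the fewer updates the camera lags behind the operator's head, which is the reason high pan/tilt speeds matter for telepresence.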

In order to permit the switching between, or merging 
of virtual imagery and real video, a multi-transputer/ 
i860 computing engine was commissioned to 
coordinate these and other real-time aspects of the 
GR6 Test Bed. This system, again the first of its kind 
in the world, is called SuperVision, developed from 
ARRL’s original Vision Computer by Division Limited, 
as described earlier. 

Together with Airmuscle Limited of Cranfield, 
UK, ARRL has also pioneered the use of tactile 
feedback for VR and telepresence applications, in both 
glove and 3D/6D ‘mouse’ form. The ARRL-Airmuscle 
Teletact I and II™ Gloves and the Teletact Commander 
spatial hand controller are now commercially available 
products. 
With regard to virtual world modelling, complex 
virtual models have been built and demonstrated, 
including a model of the Robotics Centre, complete 
with an animated Puma Robot, which can be 
teleoperated in virtual space from an iconic control 
panel and can demonstrate ‘teach-and-repeat’ 
sequences. ARRL has begun to specialize in the porting 
of models and dynamic sequences from commercial 
CAD systems (eg. AutoCAD, Dimension VR Toolkit, 
IGRIP) and other forms of modelling or simulation 
packages onto more sophisticated platforms supporting 
virtual visualization, thereby permitting ‘immersion’ 
and intuitive interaction on the part of the human 
operator. Since 1991, the VERDEX Test Bed Project 
has been further developed to expand the Company’s 
capabilities in VR generally (for new applications, not 
exclusively in the robotics or telepresence domain), 
and to focus on incorporating the latest results from 
some of the other demonstrator programmes under 
way at ARRL (again, particularly RD8 and RD1). 


2.5.3 Integrating VR and robotics 
Taking ARRL's Human Factors work a stage further, 
the Company's RD7 Demonstrator Programme, 
‘Teleoperation of, and Telepresence for, Mobile and 
Manipulative Robots’, basically set out to prove that 
the same computing systems and control/display 
peripherals used for immersing a human in a virtual 
environment and giving him freedom of interaction 
could be used as an effective means of controlling 
real robotic equipment. RD7 was, in essence, a 
systems integration demonstrator, drawing upon a 
number of key developments and skill areas within 
ARRL. On an individual technology level, the RD7 
Project resulted in major achievements in two fields, 
namely the development of a new Advanced Manipulator 
Controller (AMC) for teleoperation (see below), 
and a new low-cost breadboard solution to optical 
head tracking (under evaluation at the time of writing). 
Although an initial emphasis was placed on tele- 
presence in the context of remote driving, one of the 
major technical and commercial achievements of the 
RD7 Programme involved the development of a new robust 
and singularity-free general purpose robot manipulator 
control system. One of the key human factors aims in 
achieving robotic telepresence was the provision of a 
control system which would avoid forcing the human 
operator to indulge in awkward and fatiguing postures, 
arising due to the limitations of the existing robot 
controller and/or kinematic configuration. 
Teleoperation of manipulators is normally used for 
executing tasks in hazardous environments such as 
those encountered in the nuclear industry and in 
underwater operations. Current systems are mainly 
controlled in joint space (ie. the operator uses an input 
device, typically a joystick or master arm, to specify 
independently the required joint positions). This 
approach lessens the effects of the manipulator’s 
physical limitations and does not suffer at all from ill- 
conditioning of the cartesian to joint space kinematic 
transform. However, this type of control makes tasks 
such as teleoperated seam welding extremely difficult, 
requiring much operator training and skill. There is a 
need for controllers which allow direct teleoperation 
of the end effector cartesian position and orientation 
from the input device. Unfortunately, unlike the case 
of programmed motion, the operator is not afforded 
the luxury of developing a program which carries out 
the task by moving along singularity and joint limit 
free paths, while not exceeding the manipulator’s 
dynamic limitations. A cartesian teleoperation 
controller must be capable of accepting any input 
specified in real time by the operator while remaining 
on the indicated cartesian path and without continually 
stopping in an error state. That is, it must be robust to 
totally unstructured inputs. It is this problem domain 
which was successfully addressed by the robust 
Advanced Manipulator Controller (AMC) developed 
for the RD7 Project, in conjunction with an ARRL 
Teletact Commander spatial input device based on an 
Ascension Bird Tracker (the AMC and Commander 
Controller are now being modified for evaluations 
based on a Polhemus Fastrak Tracking System and 
RRC 7-degree-of-freedom Manipulator). 
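
The ill-conditioning problem described above can be illustrated with a short Python sketch. Near a singular arm posture, a plain inverse of the cartesian to joint space transform demands very large joint rates for small commanded tip motions. One well-known remedy, shown here, is the damped least-squares inverse, which trades a small tracking error for bounded joint rates. This is offered only as an example of a singularity-robust scheme: the method actually used in the AMC is not described in this article, and the planar two-link arm, link lengths, damping value and function names are all assumptions.

```python
import math


def jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of a planar two-link arm's tip position with respect to its joint angles."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def damped_joint_rates(q1, q2, vx, vy, damping=0.1):
    """Joint rates dq = J^T (J J^T + damping^2 I)^-1 v for a commanded tip velocity v."""
    J = jacobian(q1, q2)
    # A = J J^T + damping^2 I is a symmetric 2x2 matrix [[a, b], [b, d]].
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b  # strictly positive whenever damping > 0
    # w = A^-1 v
    wx = (d * vx - b * vy) / det
    wy = (-b * vx + a * vy) / det
    # dq = J^T w
    return (J[0][0] * wx + J[1][0] * wy, J[0][1] * wx + J[1][1] * wy)


if __name__ == "__main__":
    # Arm almost fully stretched (near-singular): rates stay bounded instead of exploding.
    print(damped_joint_rates(0.0, 1e-3, 1.0, 0.0))
    # Well-conditioned posture with negligible damping: close to the exact inverse.
    print(damped_joint_rates(0.0, math.pi / 2, 0.0, 1.0, damping=1e-6))
```

With this form the joint-rate magnitude can never exceed the commanded tip speed divided by twice the damping value, so an arbitrary operator input cannot drive the controller into an error state simply by passing close to a singular posture.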

The concluding phases of Project RD7 (July and 
August, 1992) involved a series of informal yet 
successful teleoperation/ telepresence trials. These were 
designed to test both the remote handling and driving 
concepts derived during the Programme, plus the 
integrity of the interfaces between ARRL’s Human- 
System Interface Laboratory (ie. from the GR6 or 
VERDEX Test Bed), the Mobile Robot Laboratory and 
the Puma 562 Manipulator Work Cell. For passive 
team observation purposes, a further video link was 
established between the GR6 SuperVision VR Computing 
System and a stereoscopic LCD projection display, 
located in the Company Board Room. A series of 
remote driving and manipulation tasks were designed for the 
trials. Portions of the trials were reconfigured later in 
November, for a demonstration of advanced robotics 
and VR technologies at the House of Lords in London. 


2.5.4 VR at ARRL: the future? 

The VR work of ARRL continues, not only in the field 
of advanced robotics and telepresence. Aero engine 
design, rare CAD package conversion, smoke and 
explosion modelling and visualization, integrated 
VR-multimedia systems, medical visualization, 
nanotechnology (the conversion of scanning tunnelling 
microscope data for atomic surface visualization) are 
but a few of the project areas under way at the Centre 
at the time of writing. 

In the closing months of 1992, the author was 
involved in presentations to the Department of Trade 
& Industry and the Science and Engineering Research 
Council (SERC), calling for a coordinated UK strategy 
in the field of VR. An edition of New Scientist (12 
December, 1992, No. 1851) reinforced this call and 
was noticed by researchers working in support of the 
BBC. On 19 January, 1993, the work of the Company 
was featured in a 22-minute slot on the BBC's 9 
O’Clock News. To many watching industrialists, this 
was the first time on UK National TV that audiences 
had been exposed to the serious side of VR: no games, 
no dry ice, no hype, just real applications. As a result 
of this appearance, a number of companies throughout 
the UK contacted ARRL, expressing (a) surprise at the 
technology’s commercial potential, and (b) a 
willingness to support a 2-year applications programme. 
Subsequently, ARRL launched Europe’s first major 
Virtual Reality & Simulation (VRS) Programme, 
specifically designed to meet the future needs of British 
Industry in design, planning and training. 

The aim of VRS is not only to keep industry abreast 
of significant international developments in the field, 
but also to demonstrate to participating companies the 
commercial value of Virtual Reality and Simulation. 
At the end of the 2-year Programme, VRS will provide 
the participating companies with sufficient know-how 
to introduce the technologies into their own businesses 
with minimal technical and financial risk. 

Companies have been invited to join the Programme 
on either a Full Membership or Technology Watch basis. 
Full Membership has been taken up by those companies 
who already have a well-defined application and wish 
to sponsor a demonstration of the application using 
ARRL’s VR and simulation resources. Technology Watch 
is a grade of membership designed to accommodate 
those companies who wish to keep a close watching 
brief on short-term developments within VRS, prior to 
choosing an application of their own. Another grade of 
involvement is that of Associate Membership. Associate 
Members bring important hardware, software and 
information capabilities to the Consortium, and benefit 
from being exposed to the technical requirements of 
future industrial users of VR technology. 

Also, following the announcement of a Federal 
Multimedia Initiative between the University of Salford 
and University College Salford late in 1992, the ARRL 
VR Team is actively involved in the establishment of 
new projects and courses which will, in 1993, take 
advantage of the academic resources already in place 
on the two campuses — from surveying to textiles and 
fashion design, from orthopaedics to information 
technology, from electrical and aeronautical 
engineering to nanotechnology and advanced 
microscopy. To assist in this venture, the author has 
recently accepted the post of Professor of Virtual 
Reality within the University's internationally- 
renowned Department of Surveying, and will be 
working alongside the Department's Head — Professor 
Peter Brandon — to put a new and comprehensive VR 
generic research facility in place, to complement the 
commercial venture coordinated by ARRL, and to 
provide a centre for employment of graduates, as the 
Company's VRS Initiative and related work progresses. 


2.6 Virtual Presence Limited 


One of the other UK Companies worthy of mention — 
and, indeed, one of the key Associate Members of 
ARRL’s VRS Initiative - is that of Virtual Presence 
Limited. Based in London, Virtual Presence has recently 
done much to spread the ‘serious’ news about the 
benefits of VR for British Industry. The Company's 
main claim to fame is that it is sole European Distributor 
of the Sense8 WorldToolKit Package. Virtual Presence 
also distributes other VR technologies, including the 
Spaceball, Polhemus Fastrak and related systems, the 
Virtual Research Inc. Flight Helmet Headset and the 
low-cost VREAM Virtual Reality Development 
Package, which claims to allow the user to define, 
enter and interact with three dimensional virtual worlds 
in real time, using a 386/486 PC with coprocessor and 
standard PC peripherals — mouse, joystick and 
keyboard. Apart from its price (which is less than 
£1000) an attractive feature of VREAM will be its 
ability to support interfacing with specialized hardware, 
such as graphics boards, head-mounted displays, 3D 
mice, 6D trackers and gloves, plus the porting of 
standard 3D object file formats and DXF files. 

There is little doubt that, during the period between 
the reported demise and resurrection of America's 
VPL, Britain's ‘VPL’ experienced a significant and 
short-term increase in orders. As with other UK-based 
distributors of US equipment, whether they and their 
suppliers will be able to cope remains to be seen. 
Nevertheless, given Sense8's success in developing 
and marketing WorldToolKit and in doing so, some 
would argue, establishing VR's de facto standard 
toolkit, Virtual Presence could well become one of the 
UK’s commercial survivors over what will undoubtedly 
be a 12-18 month period of mixed fortunes for many 
companies. 


2.7 British Telecom: VR and future 
communications for all 

At British Telecom's Martlesham Heath Research 
Laboratories near Ipswich, in addition to basic human 
factors research into VR and the visualization of 
communications networks and telepresence/tele- 
medicine, a project funded by the Company's ‘Action 
for Disabled Customers’ Programme is addressing the 
way that VR technology can be used to assist people 
with disabilities. This will concentrate on areas which 
can be investigated in the short to medium term, which 
are applicable to a reasonably large number of people 
with special needs and which involve or could involve 
telecommunication (for example: gestural recognition 
to permit translation of deaf signing languages, audio 
conferencing, telepresence and entertainment). BT is 
also interested in making sure that disabled people are 
not overlooked as VR standards emerge. 


3. Virtual reality and the academic scene 

It has become evident that, with one or two exceptions 
(discussed above), over the past 12 months or so, many 
academic establishments in the UK have responded to 
the ‘VR Challenge’ in an almost uncontrolled fashion. 
On the Continent, again with one or two key exceptions 
(outlined later in this article) a similar trend is now 
emerging. Unsolicited and, it must be said, technically 
ill-informed proposals to banks, libraries and all sorts 
of potential funding bodies are on the increase, many 
of which have to be passed on to the recognized 
industrial VR players for review. Typically (and 
fortunately for the reviewers), appraisals of these 
proposals take remarkably little time, as it becomes 
very clear that the proposal has focused on a particular 
field of expertise of the proposers and not on what is 
actually achievable by current VR technology. The 
methods to be employed for VR implementation of the 
application concerned and the underlying rationale for 
so doing often come over as naive and poorly presented 
— the shortest section in the proposal. The motivation 
implicit in many of these proposals appears to be one 
driven by a need to be involved in VR, rather than a 
sensible treatise of the need of the targeted application. 

An additional worry is the emergence of a very 
large number of under- and post-graduate theses 
reviewing VR technologies and philosophizing about 
the field’s future cultural and psychological impacts on 
society. These have, it appears, caused much grief and 
misery to those industrialists who have been targeted with 
open-ended requests to perform the students’ literature 
searches for them. By the end of the 1993 Academic 
Year, there will be so many under- and postgraduate 
theses addressing the same general (and, sadly, esoteric) 
issues surrounding the field, that an enormous amount 
of time and money will have been wasted. 

Now many academics have reached the ‘let’s- 
reinvent-the-wheel’ stage, challenging the quality of 
concepts and products that have already been in place 
for, and evolved over, a number of years. One recent 
project proposal even offered to develop a new range 
of body tracking systems, VR stereoscopic goggles 
and combined video-graphics toolkits (486PC-based), 
all for £60,000. There is a belief throughout academia 
that incorporating a VR element in their project proposal 
will not only ensure successful acceptance by funding 
bodies, but might even guarantee temporary or longer- 
term employment for those students who are involved. 

Academic involvement must be encouraged, but it 
has to be recognized, particularly by the givers of 
academic grants, that British Industry and commercial 
research set-ups have a vitally important role to play in 
driving and focusing the projects, so that they are 
ultimately useful to the developers of VR technologies 
and applications in a rapidly-expanding ‘market’. 
Academic projects must, in the main, exhibit novelty; 
that is accepted. But if the aim is to spawn a new set of 
technologies which will ultimately find their way onto 
the VR market, then close liaison with recognized 
industrial players must be pursued. 


4. A wider European perspective 
In Europe as a whole, the VR Research and 
Development picture has only recently become 
apparent, although again, one cannot hope to cover all 
activities in an article of this sort. Of course, the big 
news of 1992 was the future role of the French industrial 
giant Thomson-CSF in European VR, following the 
serious reduction in operations of VPL Inc. Besides 
internal power struggles, reports indicate that the 
minority shareholding held by Thomson-CSF's venture 
capital division culminated in a loan secured on VPL’s 
patents, most notably the so-called ‘global’ patent, 
covering the use of instrumented gloves for interaction 
with computer-generated and virtual graphics. This 
patent has already been cited in litigation against 
potential US Competitors, such as Virtex’s CyberGlove, 
and it remains to be seen how the remains of VPL, not 
to mention Thomson-CSF, attempt to exploit this patent 
protection further. Those researchers who have the 
time to read entries in the sci.virtual-worlds newsgroup 
on electronic mail will have noticed an ‘industry flash’ 
early in December which contained reports from the 
Washington DC VR Conference that the survivors of 
the VPL demise might well try to raise short-term cash 
by enforcing the contents of the glove patent. With 
VPL’s ‘resurrection’, following reports of a successful 
shareholders’ meeting in February of 1993, no doubt 
the patent issue will stay live for some time to come. 
There are those who appear to be unconcerned at 
this prospect, questioning the overall validity and 
originality of the patent’s claims. There are, however, 
others, who are (quite rightly) keen to exploit their new 
glove technologies, for the benefit of their own 
organization and that of European collaborators. One 
example here is that of one of the more promising 
ESPRIT II Projects under the programme name of 
GLAD-IN-ART (Glove-Like Advanced Interface for 
the Control of Manipulative and Exploratory 
Procedures in Artificial Realities). Under the leadership 
of AITEK srl of Italy and with the collaboration of the 
Scuola Superiore di Studi Universitari e di 
Perfezionamento (Pisa, Italy), Technology Application 
Group (UK), Trinity College Dublin (Eire) and Video 
Display Systems SpA (Italy), GLAD-IN-ART 
commenced in 1990 and is due for completion in 
December of 1993. This Project is concerned with the 
development of an advanced interface system capable 
of allowing the human operator to interact with entities 
in virtual environments. The Project plans to produce 
a generic test bed for the evaluation of instrumented 
gloves, video processing of gesture commands and 
tactile/force feedback. The design concept underlying 
the GLAD Input/Feedback Glove (currently the focus 
of an international patent application) is based on the 
use of microactuators and Kevlar ‘tendons’. Tracking 
technologies for the GLAD initiative have been 
developed at Trinity College, Dublin, based on optical 
and image processing techniques. An exoskeletal input- 
feedback system is expected to be developed to full 
prototype status in 1993. The GLAD Team is one 
example of a collaborative project which looks as if it 
has produced a significant market product, yet is hesitant 
in taking that product to market because of the VPL 
patent. One can only hope that there are no more 
attempts to quash innovation such as that evident in the 
GLAD Project, thus allowing more reliable and usable 
glove technologies to be exploited by those with specific 
applications in mind. 


4.1 Virtual acoustics 

Work in the field of acoustics for VR applications does 
not feature highly in reported initiatives throughout 
Europe, although one should not ignore here the support 
and commercial exploitation by Division Limited of 
such products as the Convolvotron, Acoustetron and 
Beachtron (developed by Crystal River Engineering 
of the US — now part of Division Inc. on the West 
Coast). Perhaps the best example of an initiative in 
virtual sound is that centred at the Ruhr-Universität in 
Bochum, Germany, which is concerned with 
developing virtual acoustic models of rooms and 
auditoria. Projects in this field have been in place for 
many years at Bochum and have recently received a 
new wave of interest thanks to the establishment of a 
Telepresence Consortium at the University, which 
coordinates Research and Development, together with 
the transfer of technology into VR applications. 


4.2 VR and the European Space Scene 
Researchers at the European Space Research and 
Technology Centre (ESTEC), in Noordwijk, Holland, 
have been keeping a watchful eye on developments in 
VR and an Invitation to Tender for a fully-funded VR 
Test Bed Project called MVS (Man in Virtual Space) 
was issued late in 1991. Despite numerous European 
proposals being supported by very strong and 
established UK VR concerns, ESTEC took what many 
researchers (including, reportedly, some within ESTEC 
itself) believe to be a highly political decision in 
awarding the MVS contract to Videosystem of Paris (a 
company involved in graphics design for special effects 
and advertising, and a now ex-distributor of VPL 
equipment and software). There are also reports that 
ESTEC has been considering using VPL's DataSuit, 
or even a custom-built system, to study human 
anthropometry and biomechanics in parabolic flight 
aircraft, used to provide a temporary zero-gravity 
environment. On a more general level, there is interest 
in using elements of VR technology in the area of 
training for Extra-Vehicular Activities (EVA). For 
many years now, the Spacecraft Operations Simulation 
Facility at Martin Marietta in Denver, Colorado has 
been using a form of virtual reality, whereby a trainee 
astronaut can ‘fly’ around a virtual Shuttle, or Space 
Station Freedom, using a multi-axis Manned 
Manoeuvring Unit (MMU) simulator. The computer- 
generated images are presented to the trainee using a 
large screen. VR is now being considered as an 
alternative to the costly neutral buoyancy facilities and 
full-scale mock-ups currently in use by NASA and 
under future consideration by ESA. 


4.3 Virtual Olympics? 

An impressive multi-user VR Project, based at Silicon 
Graphics (SG) in Spain has Olympic sports as its 
focus. The goals of this project are twofold, namely to 
develop VR environments for (a) a two-user skiing 
simulator, and (b) a four-'explorer' scenario in a 
digitized virtual representation of the 1992 Summer 
Olympic Finals in the Barcelona Stadium (the 3D 
digitized images will be provided by CAR (Centro de 
Alto Rendimiento), a Spanish Sports Research Institute). 
In addition, there is further interest in the possibility of 
exhibiting the UC Davis Bobsleigh Simulator in VR 
form. At the time of writing, the Project uses an SG 
Skywriter 440 with 2 VideoSplitter subsystems, 2 SG 
Indigos, 4 VPL EyePhone LXs, 2 Polhemus 3Space 
Trackers (8 sensors in total) and a single VPL 
DataGlove. In the skiing simulator, 2 users will be 
present at the same time in a shared virtual ‘resort’. 
Each user will be able to see a virtual representation of 
the other. Four Polhemus Trackers will be provided 
per user in order to track the knees (1), the hands (2) 
and the direction of view (1). The users will wear an 
actual pair of ski boots and skis, which will be fixed to 
a flat base with only one possible movement (rotation). 
It is of interest to note that a Japanese virtual skiing 
system (incorporating simple motion plates) was 
featured on what has become affectionately known as 
the ‘..and finally...’ spot on a British TV News 
Programme during the 1992/93 Christmas Holiday 
Period (the simulator, developed by NEC, was also 
reviewed in the December issue of VR News). Although 
the system has been primarily designed for ski training 
and physiological assessment, it must be said that the 
graphics shown on this short feature were significantly 
inferior to those developed by the Spanish. The only 
drawback of the Spanish skiing simulator as it existed 
in 1992, however, was the absence of motion. Reports 
from those who used the system indicated that the 
experience was more akin to flying over the Alps at 
Mach 3! 

In passing, it should be noted that Spain has, 
perhaps, become the most enthusiastic of all the 
Continental European Countries in terms of the future 
of VR. In the view of the author, Spain is also the most 
collaborative of all European Countries, with 
researchers and companies more than willing to share 
experiences and work with other nations throughout 
the Community. Already small commercial set-ups 
and consultancies have been established, and exciting 
new university courses in Applied VR are being planned 
for the new academic year starting in 1993. 


4.4 Dutch VR initiatives 

VR research activities in the Netherlands have, over 
the past 6-10 months, become quite intense, with work 
being carried out in generic hardware/software 
architectural issues, human factors aspects and 
industrial and training applications. 

For example, the TNO Institute's Physics and 
Electronics Laboratory in the Hague (The Netherlands) 
has been involved for many years in the development 
of training and simulation systems for various defence 
applications. New human interface techniques, such 
as those currently being investigated in VR work are of 
prime interest for the development of advanced trainers 
and simulators. Direct stimulation of human senses 
(eyesight, auditory and tactile), and new paradigms for 
user input, will improve the realism of simulations and 
thereby the effectiveness of training systems. The 
virtual environment project at TNO-FEL aims at the 
improvement of current simulation systems by 
enhancing the human interface and thereby effectuating 
full inclusion of the user within the simulated 
environment. Earlier work carried out at TNO-FEL 
concentrated, amongst other topics, on the visual sense. 
Much effort has been expended in the development of 
systems for realistic real time visual simulation. At the 
basis of these systems lies the use of parallel processing 
architectures using general purpose processors, 
resulting in flexible and scalable systems that can 
easily be tailored for specific applications at both the 
hardware and software level. Due to the use of these 
powerful architectures, the current generation of visual 
systems features real time visualization of photo- 
textured environments. 

For the rapid development of virtual environment 
applications, a flexible software platform is required 
that fully exploits the underlying parallel hardware 
architecture, while offering a high level interface to the 
developer of virtual environments and simulation 
scenarios. One of the few systems available is the dVS 
Distributed Virtual Environment Operating System, 
developed by Division Limited. TNO-FEL has obtained 
a source licence for this software and is using it on 
different parallel hardware architectures, both shared 
memory as well as distributed memory MIMD systems. 
The software will be used for a number of research 
projects that aim at assessing the suitability of VR 
technology for advanced trainers and simulators. 
Research topics include the application dependent 
evaluation of new human interface techniques, 
improvement of visual depth cues, addition of auditory 
and tactile interfaces, interaction techniques, improved 
simulation models, and usability in different areas. 
Although the application areas primarily considered 
are trainers and simulators for defence and aerospace, 
it is envisaged that the results will be useful for other 
Rail. The proposed system will use two linked PCs. 
One of these will be used to run existing railway 
simulation software to control and model the train's 
motion. The second PC, hosting WorldToolKit and an 
Intel Digital Video Interactive Action Media 2 board, 
will provide the graphical features under test (i.e. new 
trackside indicator designs, environmental effects) on 
a large-screen display. One of the useful features of rail 
transport when considering simulation is that trains 
move in reasonably predictable ways, being restricted 
to fixed tracks. This means that the visual fidelity of 
the simulation can be enhanced through the use of 
canned video sequences, with the VR Toolkit providing 
the freedom to superimpose graphics as and when 
required. 

Another Dutch Project worthy of mention was 
recently completed by the Calibre Institute, part of the 
University of Eindhoven, and involves the future 
training of architectural students, as well as use of VR 
for design and consultancy projects. The Institute’s 
development package consists of a suite of CAD-like 
functions, referred to as Computer Aided Architectural 
Design (CAAD) software. The software is hosted on 
an early Division Pro Vision VR engine, linked to a PC 
host running UNIX and X-Windows, and visualization 
is provided by means of either a Virtual Research 
Flight Helmet or a VPL Eyephone Headset. 

The CAAD software provides a toolkit for the 
creation, modification, and visualization of architectural 
structures. The importing and exporting of AutoCAD 
DXF files is supported, and the software can be used to 
add animation and dynamic behaviour to objects in the 
virtual environment. The main use of the system to 
date has been to model a small Dutch town, consisting 
of several hundred buildings, with the specific aim of 
designing a museum located on an island in a river. 
The town is built on the slopes leading down to the 
river, and overlooks the island. A map of the town was 
first digitized, creating a basic AutoCAD model. Each 
building was then placed in its proper 3D position, 
taking into account the contours of the hillside. Further 
architectural detail was then added, including building 
features such as door and window-frame designs, in 
London (demonstrating VR world modelling using 
Alias Upfront and the Sense8 WorldToolkit). The key 
speaker from Belgium was Philippe Van Nedervelde, 
who presented a paper entitled ‘Making Virtual Reality 
a Reality in Belgium’. During his address, Van 
Nedervelde echoed some of the cries for academic and 
industrial awareness being raised in other countries, 
although he admitted to arguing from a position of 
weakness. Belgium has very little research and 
development ongoing in the field of VR. 

Nevertheless, the Brussels event, organized by 
Nautilus Projects (a Brussels consultancy outfit), 
demonstrated the enthusiasm of the Belgian people 
towards this new technological field. Many hundreds 
of people came to learn about international 
developments and of the establishment of a Belgian 
VR Society, which had been officially launched in 
October of 1992. The goals of the Society, which looks 
upon itself as a professional organization, are to 
promote, support and actively undertake the study, 
practical development and use of VR in the broadest 
sense, and of VR technologies and products in the 
narrower sense. These goals the Society aims to achieve 
through the establishment of a VR Laboratory and 
‘Info Centre’, adopting the MIT Media Lab and 
Seattle’s HITLab as role models. 

The Society’s membership is growing steadily. 
Already, a consortium has been formed (interestingly, 
outside the auspices of the EEC technology and funding 
initiatives), consisting of Lernout & Hauspie Speech 
Products, developing a multilingual speech interface; 
Optronic Instruments and Products, investigating semi- 
or fully immersive head-mounted displays; the Applied 
Computer Science Laboratory of the Limburgian 
University Campus, researching transputer-‘boosted’ 
real-time 3D graphics; Roland Benelux, developing 
3D auditory displays; Bemar Projects, addressing 
multimedia inputs in virtual environments; and Asimix, 
investigating a ‘head-mounted olfactory stimulator’. 


4.6 Germany’s VR Demonstration Centre: 
another studio? 

The Fraunhofer Computer Graphics Institute (IGD- 
FhG) in Darmstadt is one of about 50 Fraunhofer 
Research Institutes in Germany (a good number of 
which have already embarked on VR projects, such as 
the telerobotics test bed work of IPA-FhG in Stuttgart 
— which bears a remarkable likeness to that carried out 
in the UK by ARRL). IGD’s Research and Development 
work concentrates on computer graphics related areas 
such as visual simulation, visualization, imaging and 
VR and others. IGD conducts applied contract research 
in medium and long-term projects requiring complex 
infrastructures, technical support and professional 
project management. Most of the work carried out by 
IGD is development and industry-driven. IGD is closely 
related to the Technical University and Computer 
Graphics Centre in Darmstadt. About 100 scientists 
are involved in Darmstadt in the development of 
graphics hardware and software in related fields. 
FhG-IGD plans to establish a VR Demonstration Centre by 
the end of 1992 to evaluate the impact of VR in future 
systems and interfaces. The mission of the Centre is to 
provide consumers and producers with access to 
leading-edge visualization, simulation and VR 
technologies in a test bed environment where new 
ideas and experiments can be validated through realistic, 
hands-on experimentation. 


4.7 VR initiatives in Sweden 

A Swedish Research Project, MultiG, is a multiple- 
participant effort to develop a national multi-gigabit/s 
fibre-optic data communications infrastructure, to 
develop protocols and hardware for its utilization and 
to define applications that can use the bandwidth to 
advantage. Coordinated by SICS (the Swedish Institute 
of Computer Science, which conducts research in 
collaboration with universities and industry), one of 
the sub-projects within MultiG is Telepresence, in this 
case an attempt to develop a distributed environment 
for virtual worlds applications. The main participants 
are the Interaction and Presentation Laboratory of the 
Department of Numerical Analysis and Computing 
Science at the Royal Institute of Technology and the 
Distributed Systems Laboratory at the Swedish Institute 
of Computer Science. The collaboration is being 
included in the EC COMIC project, with the 
participation of the University of Nottingham. The 
Telepresence system uses the ISIS package from 
Cornell University for distribution and can display 
graphics on SPARCstations, IBM RS/6000 
workstations and Silicon Graphics computers. 

Another sub-project within MultiG concerns the 
building of a VR system called DIVE (Distributed 
Interactive Virtual Environment). DIVE is a Unix- 
based heterogeneous distributed system that is easily 
extended to new hardware platforms and graphics 
libraries (DIVE is currently running on Sun SPARCs, 
IBM RS/6000s and SG Indigos). Since DIVE is fully 
distributed (using the ISIS distributed programming 
toolkit from Cornell University), many processes on 
different nodes in the network can work in the same 
world. The nodes are divided into visualization nodes 
(which handle user interfaces) and computational nodes 
(where ‘AI’ processes run, an AI in this case being a 
process that handles objects in the world but does not 
have a renderer — for example, a process that turns the 
hands of a clock, implements gravity or checks for 
collisions). SICS distributes the system, and a new 
version (V2.0) is scheduled for release before the end 
of 1992. 

Version 2.0 has a much more detailed object model 
than Version 1.0, with structured object hierarchies 
and cumulative transformation matrices, making it 
easy to move objects relative to themselves, their 
super-objects, the global origin or even arbitrary objects 
in the world. Objects can now have a ‘behaviour’ - 
basically one can specify a DFA (Deterministic Finite 
Automaton) for each object, with arcs between states 
performing actions such as moving the object, making 
it invisible or sending signals to other objects. State 
transitions are triggered by incoming signals to the 
object. The interface to ‘AI’ applications has been 
structured and expanded, making it easier to create 
interesting applications. The ‘visor’ concept has been 
incorporated in the system so that an AI can attach 
objects to an invisible visor hanging in front of the 
user’s eyes, making it easy to put interface controls 
within comfortable reach of the user. 
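
The behaviour mechanism described above can be illustrated with a minimal Python sketch. This is not DIVE code: the class `VirtualObject`, the helpers `move_by`, `set_visible`, `send` and `chain`, and the door/lamp example are all hypothetical names chosen for illustration. The sketch only assumes what the text states, namely that each object carries a deterministic finite automaton, that transitions are triggered by incoming signals, and that the arcs perform actions such as moving the object, hiding it or signalling another object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class VirtualObject:
    """Hypothetical world object whose behaviour is a DFA."""
    name: str
    state: str
    position: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    visible: bool = True
    # (current state, incoming signal) -> (next state, action on the arc)
    arcs: Dict[Tuple[str, str], Tuple[str, Callable[["VirtualObject"], None]]] = field(
        default_factory=dict
    )

    def add_arc(self, state, signal, next_state, action):
        self.arcs[(state, signal)] = (next_state, action)

    def receive(self, signal: str) -> bool:
        """Fire the arc for (state, signal); return True if a transition occurred."""
        arc = self.arcs.get((self.state, signal))
        if arc is None:
            return False  # a signal with no matching arc is ignored
        next_state, action = arc
        action(self)
        self.state = next_state
        return True


def move_by(dx, dy, dz):
    def action(obj):
        x, y, z = obj.position
        obj.position = [x + dx, y + dy, z + dz]
    return action


def set_visible(flag):
    def action(obj):
        obj.visible = flag
    return action


def send(target, signal):
    def action(obj):
        target.receive(signal)
    return action


def chain(*actions):
    """Run several actions on the same arc, in order."""
    def action(obj):
        for act in actions:
            act(obj)
    return action


# Illustrative world: pushing a door slides it aside and toggles a lamp.
lamp = VirtualObject("lamp", "off", visible=False)
lamp.add_arc("off", "switch", "on", set_visible(True))
lamp.add_arc("on", "switch", "off", set_visible(False))

door = VirtualObject("door", "closed")
door.add_arc("closed", "push", "open", chain(move_by(1.0, 0.0, 0.0), send(lamp, "switch")))
door.add_arc("open", "push", "closed", chain(move_by(-1.0, 0.0, 0.0), send(lamp, "switch")))

door.receive("push")
```

Keeping the transition table as a dictionary keyed on (state, signal) makes the automaton deterministic by construction, since each pair can map to only one arc.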

The overall implementation strategy is pretty much 
the same — SICS use separate worlds with opaque 
gateways between them to keep the amount of 
information manageable, and ISIS process groups for 
interprocess communication. 

Most processes have their own copy of the world 
database, just as in Version 1.0, but a new class of 
process has been added, referred to as ‘lightweight’ AIs: those 
that do not possess any world state. These processes do 
not participate in state transfers and multicast 
communication, so they do not cause any system 
overhead. On the other hand, they cannot perform any 
computation which is dependent on the state of the 
database, and their applicability is therefore somewhat 
limited. 

The new version still supports only CRT output, 
with a mouse and an Ascension Bird as input devices. 
It is, however, relatively easy to extend it to other I/O 
units. The renderer is a separate application, and input 
devices are handled by ‘vehicles’ — software modules 
that map from input devices to actions in the DIVE 
worlds. 

Whilst DIVE V2.0 is being completed, ‘AI’ 
applications are being created to be used in the system. 
One ongoing application is in collaboration with the 
Swedish Institute for the Handicapped on a project that 
involves controlling a real-world robot from within the 
virtual environment. With visor control, it is possible 
to switch viewpoints to enable, for instance, precision 
operations ‘looking out’ from the virtual robot gripper. 

Another application relates to virtual conferencing. 
One concept here concerns the development of a 
‘distributed whiteboard’ (which has become SICS’s 
contribution to the field of VR documents) to work in 
the new version of DIVE. Several copies of a 
whiteboard can exist in the virtual world, and anything 
‘drawn’ on one is displayed on all the others. Another 
interesting concept is that of virtual ‘aura’. An aura is 
basically an invisible volume round each user’s 
graphical body, and can be used to determine when 
users have come close to each other (the auras collide). 
When this happens, the system can, for example, 
establish a speech connection between these users. 
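
A minimal Python sketch of the aura idea follows. It assumes, purely for illustration, that each aura is a sphere centred on the user's graphical body; the text says only that an aura is an invisible volume. The names `User`, `auras_collide` and `speech_links` are hypothetical and are not taken from DIVE.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class User:
    name: str
    position: tuple      # (x, y, z) of the user's graphical body
    aura_radius: float   # assumed spherical aura around the body


def auras_collide(a: User, b: User) -> bool:
    """Two spherical auras collide when the centres are closer than the radii sum."""
    return math.dist(a.position, b.position) <= a.aura_radius + b.aura_radius


def speech_links(users):
    """Return the set of user pairs that should get a speech connection."""
    return {
        frozenset((a.name, b.name))
        for a, b in combinations(users, 2)
        if auras_collide(a, b)
    }


users = [
    User("ann", (0.0, 0.0, 0.0), 1.5),
    User("bo", (2.0, 0.0, 0.0), 1.0),
    User("cy", (10.0, 0.0, 0.0), 1.0),
]
links = speech_links(users)  # only ann and bo are close enough
```

Checking every pair is quadratic in the number of users, which is acceptable for a small conference; a larger world would presumably need some spatial partitioning.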

Other SICS work includes the development of a 
colour head-mounted display using 1" CRTs and Liquid 
Crystal Shutter displays from Tektronix, and the design 
and build of a ‘sound renderer’ or ‘auraliser’ for DIVE. 
The idea for the sound renderer is to use a ‘source/ 
filter/listener’ audio model to allow VR objects or 
events to have sounds, and for these sounds to be 
spatialized and mixed for each user’s perspective (just 
like the visual renderers for multiple users display 
different perspectives of the state of a virtual world). 
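
The ‘source/filter/listener’ model can be sketched in a few lines of Python. The text gives no detail of the filters SICS intended, so everything here is an assumption made for illustration: a 2D world, inverse-distance attenuation as the only filter, and a constant-power stereo pan. The names `Source`, `Listener`, `distance_filter` and `render_for` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Source:
    position: tuple    # (x, y) of the sounding object in the world
    amplitude: float   # current output level of the object


@dataclass
class Listener:
    position: tuple    # (x, y) of the user
    heading: float     # radians; direction the user faces, 0 = +x axis


def distance_filter(amplitude, distance, reference=1.0):
    """Assumed filter: inverse-distance attenuation, clamped inside `reference`."""
    return amplitude * reference / max(distance, reference)


def render_for(listener, sources):
    """Mix all sources into (left, right) levels for one listener's perspective."""
    left = right = 0.0
    for src in sources:
        dx = src.position[0] - listener.position[0]
        dy = src.position[1] - listener.position[1]
        level = distance_filter(src.amplitude, math.hypot(dx, dy))
        # Bearing of the source relative to the listener's facing direction.
        bearing = math.atan2(dy, dx) - listener.heading
        # pan is -1 for a source hard left, 0 dead ahead, +1 hard right.
        pan = -math.sin(bearing)
        angle = (pan + 1.0) * math.pi / 4.0   # constant-power pan law
        left += level * math.cos(angle)
        right += level * math.sin(angle)
    return left, right


bell = Source(position=(0.0, 2.0), amplitude=1.0)
facing_east = Listener(position=(0.0, 0.0), heading=0.0)           # bell is to the left
facing_north = Listener(position=(0.0, 0.0), heading=math.pi / 2)  # bell is dead ahead

mix_east = render_for(facing_east, [bell])
mix_north = render_for(facing_north, [bell])
```

The same world state yields a different mix for each listener, which is the parallel with the visual renderers drawn in the text.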


5. Concluding statements 

Despite the somewhat uncoordinated picture of VR 
Research and Development in Europe portrayed here, 
there are, without doubt, a number of excellent projects 
currently under way throughout the Continent. The 
potential for VR and related projects throughout Europe, 
it seems, is enormous. Not a single week goes by 
without one or more new (not necessarily serious) 
ideas or areas of interest being expressed by visitors to 
Research and Development Centres throughout the 
Continent: virtual landscape gardening systems, VR 
for psychic and mystic training, networked VR and 
multimedia demonstrators, timeshare apartment design, 
virtual mining simulations, offshore (subsea) 
visualization, VR for the training of ‘Persons On 
Board’ in the oil and gas industry — offshore platform 
layout and evacuation familiarization, health and safety 
standards for VR, interactive aircraft engine design, 
medical applications (including body scanner data 
visualization, orthopedics, minimally invasive surgery, 
psychotherapy, migraine simulation), virtual shopping, 
wholesale warehouse management, kitchen and kitchen 
appliance design, space activities training, submersible 
control... the list goes on. 

The industrial and commercial interest is definitely 
there, yet the financial resources are not. Very few of 
the companies and commercial organizations 
expressing a strong desire to drive or be involved in 
some of the project areas listed above have the resources 
to provide the 50% funding demanded by the somewhat 
unique European political mentality for fostering 
technological innovation. Many of the exciting ideas — 
those with real applications potential — are conceived 
within small companies (and even within academia) 
who then have to sell out to other, non-European 
Nations, to acquire the relatively small development 
budgets they require. In doing so they stand to forfeit 
their control over the distribution and exploitation of 
intellectual property. 

Until now the absence of a serious VR coordination 
effort throughout Europe (despite the submission to 
ESPRIT II of such a venture, led by the Scuola Superiore 
di Studi Universitari e di Perfezionamento, referred to 
earlier) has prompted the establishment of many so- 
called ‘Demo Centres’, ‘Info Centres’, VR Clubs, VR 
‘Operations’, User Groups and the like, typically formed 
by small bands of enthusiasts or overnight specialists, 
a good many of whom have had to turn to established 
centres of excellence to obtain the knowledge they 
then sell on to industry and academia! 

There is a genuine hope that the commercial 
funding situation for real applications will improve 
dramatically throughout 1993, even with the 
prevailing economic problems. Bob Jacobson, of 
WORLD DESIGN in the States summed up the 
situation facing developers and users of VR succinctly 
in 1992 by stating that many companies expressed an 
interest in placing a contract for VR development 
with him. The problem was that, until very recently, 
not one of those companies wanted to be the first to 
do so! As with many aspects of VR, a similar state of 
play exists throughout Europe, albeit trailing a year 
or so behind the American experience. 
Abstract 


In many respects the office has changed little over the past 200 years. The introduction of the telephone, copier, fax 
and computer have only served to speed up and proliferate the basic processes. We are now faced with an increasingly 
complex and difficult environment that requires fundamental changes to humanize the processes. In this paper we 
address some of the interface issues that now appear to have near term solutions. 


Prologue 

Life has evolved over the last 600 million years with 
man coming to true superiority in the last 100 thousand 
or so. This evolutionary process has equipped us for an 
environment consisting of modest size entities moving 
at reasonable speed in reasonable time frames. In 
contrast the evolution of technology during the past 60 
years has placed us in a foreign environment with 
massive or minute entities moving at incredible speed 
in very short time frames with very little or no perceptive 
delay between events. That is: the electronic evolution 
has now outstripped the rate of our biological evolution 
to cope with change. Moreover, our interface with tech- 
nology has generally been designed for the convenience 
of the technology and is not intuitive or biologically 
matched to our abilities. If we are to change the office 
and the working environment significantly in the future, 
then these issues have to be addressed and the interface 
has to have a human orientation. 

Computer and communications technologies now 
look ripe to introduce some radical and long overdue 
change. All the technology and know-how is available 
(in abundance) to revolutionize the office, the home 
and the place of work far beyond the evident results of 
incrementalism we currently enjoy. In many respects 
we might now therefore consider the modern office to 
be an unnatural, and even hostile environment for 
most humans. We currently suffer interfaces and 
conditions that are not convenient, user friendly, or 
conducive to efficient and pleasant operation. This 
applies for the interfaces between humans or individuals 
and machines. This is not however, in our view, a 
necessary condition and perhaps more importantly, it 
is not sustainable in the long term. The question is: 
what is sustainable, and what happens next? 

Addressing this question and the likely solutions to 
the preceding problems presents the main theme of 
this paper. We thus explore a series of proposals for 
potential future office environments. These follow an 
underlying holistic approach to the integration of 
existing and new technologies to create a new IT 
environment. This is integrated, intuitive and responsive 
to human needs. It also places both the user and the 
tasks/work as the central focus. 


Wishes 

I wish: 

• My phone was always with me. 

• My desk wasn't covered with machines and cables. 

• I could find people when I want them, and they 
could find me. 

• I could get rid of all this paper. 

• I didn't have to travel so much. 

• I could be in two places at once. 

• My PC wasn't so thick and unfriendly. 

• I could simultaneously see multiple (and full) 
electronic pages. 

• My PC screen was flat, horizontal and had the 
qualities of paper. 

• I could talk to my PC and it to me. 

• I had an automatic database of all my contacts. 

• My mail was electronically sorted, summarized, 
prioritized and filed. 

• I could voice annotate documents. 

• I had all the power of my office wherever I was. 

• It didn't take 3 months to move office and restore 
full IT. 

• etc, etc. 


So what are the critical barriers to meeting this wish 
list? Some would be: 

• Limited office wiring and communication bandwidth. 

• Constraints of cordlessness and portability. 

• Teleconferencing that is limited and lacking in 
realism/facilities. 

• Too much THICK (unintelligent) paper. 

• Information overload, categorization, filing and 
retrieval. 

• Inadequate and inhuman interfaces. 

• Multiple devices which don't easily integrate. 

• Storage and processing power available only as 
hardware. 

• etc. etc. 

So here is a proposal for a method of breaking 
down these barriers in the office environment centred 
around the realization of a ‘future desk’. The desk is 
realized with currently available technology integrated 
to satisfy all of our known and well defined 
requirements, but with the inclusion of a set of human 
orientated interfaces. 

Specific features of the desk include: optical 
communication that is cordless and large bandwidth; 
built-in equipment and an active surface for document 
display, manipulation and cordless/active peripherals; 
multi-standard input and output devices; intelligent 
non-intrusive interfaces; software filing, summarizing, 
and correlating; intuitive and ergonomic control 
systems; built in recognizers for ‘hot desking’, with a 
secure data environment; teleconferencing with human 
scale interactive images; hi-fi acoustics; voice I/O and 
command. Let us look at some of these features in 
detail in the sections that follow. 


Office wiring 

One of the major limitations of present day office 
design and realization is the necessity for hard wired 
desks. Even with the exciting optical fibre technology 
developments there still remains an underlying problem 
with the cabled office: getting fibre or cable to where 
you want it. With the increasing demands for more 
communication this is likely to become even more 
problematic in the future. In most sectors the speed of 
market and technology induced change implies regular 
re-organizations and movement of staff and operations. 
It is not uncommon to find that moving office currently 
involves a delay of some 3 months before all electronic 
back-up systems are terminated at your desk. 

Optical wireless affords an important means of 
short range, diffuse and line-of-sight fixed and mobile 
communication for inside the office without the 
regulatory or frequency restrictions of radio alternatives. 
Furthermore the bandwidth of the channel is potentially 
as broad as cable-based optical fibre systems, thereby 
allowing broadband multi-channel services. The 
principle is directly analogous to radio. Data can be 
omnidirectionally radiated from a ceiling, desk or 
body mounted antenna and transceiver (Fig 1). 
Transceivers using holographic dispersers can 
illuminate very well defined cells so that different data 
domains can be accurately positioned and addressed 
within the office environment (Fig 2). 

Fig.1 Office communications using diffused infra-red light 
(optical wireless in the office: ceiling satellite) 

So with optical wireless the office can have an 
omnipresent optical ether so that people and their 
desks can be mobile and still have broadband 
communication. People and equipment are thus free to 
roam within a building with no more data, printer, fax 
or telephone cables — only power is required. 
Fig.3 Lightweight cordless headset using optical wireless 
communications 


The optical ether also enables the use of a lightweight 
headset with microphone and earpiece to provide cordless 
communication (Fig 3). Furthermore, voice recognition 
software allows direct voice I/O with computer and 
communication systems. With intelligence built into 
the cellular optical wireless system the headset can be 
tracked and automatic location and activity systems can 
be used to produce ‘who, where and when’ activity databases. 
Combining voice recognition and the location facility 
provides a secure method of ‘hot desk’ operation 
anywhere in an office. Talk to any desk and it can 
check your identity and configure to your own personal 
definition using the broadband optical communication 
to access your virtual desk’s facilities. 


The desk 

Today desks are passive objects on which we stack, 
and in which we store, things. Technology has made 
them a mass of wires, equipment boxes, keyboards, 
mice and phones; none of which easily work with each 
other and all with their own proprietary interfaces. The 
wiring alone causes configuration nightmares whilst 
the integration of diverse software and hardware is 
rapidly approaching the impossible. One solution to 
this is an active desk with: a built-in optical backplane; 
a partitioned structure used to house equipment; 
an inductive working surface to provide battery 
charging and communication to cordless peripherals; 
ergonomically built-in multiuse displays and input 
devices; and a radically new user interface such as ‘hands in 
the screen and eye plus voice tracking’. 

If optical wireless is used in the office, then cordless 
objects could be used instead of wired mice and 
keyboards utilizing the optical ether. Inductive loops 
printed below the surface of the desk (like a car’s 
heated rear window) would charge anything placed on 
its surface. A laptop or active organizer placed on the 
desk would be trickle charged at the same time as 
communicating with the desk allowing the full 
processing power of the desk to be instantly available 
without any physical connections. 

With equipment slots built into the structure of the 
desk multi-vendor devices could be added (like shelves 
in a racking system) and communication links established. 
This would mean a device such as a CD-WORM 
unit could be purchased and just dropped into a slot. 
Intelligence in the desk would establish power and 
communication and integrate it into the computer workspace 
as well as displaying its controls on any desired display. 


Video conferencing 

Video conferencing has the ability to radically reduce 
the need for people to travel and can also deliver a new 
team working medium for geographically dispersed 
organizations. With the addition of telepresence 
hardware, a person can literally ‘be in two places at 
once’. The constrained bandwidth available today for 
this human interface currently produces visual 
anomalies in the perception of the images and is 
detrimental to realizing its full potential. To improve 
and humanize the limiting aspects of videoconferencing 
a different type of interface is proposed. 

A large rear projected HDTV monitor can be 
ergonomically placed in the desk. This produces high 
definition life-size images in front of the user (in a 
natural face-to-face mode). By the use of an LCD 
shutter as the screen material a video camera can be 
aligned to be looking directly at the user through the 
screen. This enables a human sized image of your 
conversant with eye to eye contact and gaze awareness. 
Because of the large size of display the peripheral 
vision would be partially filled and create a feeling of 
‘being there’ rather than watching a picture. 


As this display is High Definition, it can also 
be used as a computer monitor and in many applications 
allows the mixing of videoconferencing and computer 
generated data. 

By using an infra-red emitting pen the screen can 
also be turned into an electronic whiteboard via 
infra-red sensing in the camera driving the cursor controls of 
the computer. This allows multiple videoconferencing 
participants to work together in the same electronic 
media space in real time. People sitting at desks 
thousands of miles apart come together in an electronic 
medium to realize real time team working. 


Hands in the screen 


The preceding user interface has been realized by an 
integration of currently available technologies and 
working practices. However, the addition of an 
overhead camera, scanning the desk’s surface, and 
producing a positional image of the user’s hand (or 
‘finger worn’ 3D RF positioning sensors) allows the 
realization of an economic ‘hands-in-the-screen’ 
interface. This direct hand control and manipulation of 
objects is linked to the function of the computer and 
peripheral equipment. No keyboard or mouse control 
is necessary; just speak the text and then ‘grab it’ and 
put it where you want it. 

Current work is centred on 3D data visualization. 
A ‘hands in screen interface’ allows modelling and 
manipulation of data and virtual objects. These can be 
placed in the medium viewed through the screen and 
directed by a combination of voice and hands. To 
further enhance the lifelike and intuitive nature of this 
interface, 3D technology is being introduced to add a 
depth of vision, dimension, reality, and personality to 
the environment. Objects are being humanized to react 
emotionally and give heuristic guidance during 
interactive sessions with movement, stance, colour 
and/or audio to convey reactions. For example: icons 
try to avoid your hand if the action is questionable, or 
become defensive if you are about to initiate a damaging 
direction of actions. Alternatively, you could move 
your hand and grab a document, pull it to the print area 
and it would wriggle and complain as you hadn’t yet 
reviewed the spelling but it wouldn’t actually prevent 
you from printing. 


Electronic post-it 

To ensure that the main working display is not crowded 
with buttons, icons and electronic messages, another 
simple display with a touch sensitive surface and voice 
activation can be appropriately positioned. This can be 
used for telephone directory listings, ‘post-it’ pads and 
soft keys for all desk controls. For example, this 
enables an up-to-date electronic directory to be 
displayed and a telephone call established whilst still 
being part of a video team working session. This also 
acts as the control panel for all equipment that is 
installed in the desk — no more front panels on boxes 
you can’t reach. 

To aid use, frequently used functions remain large 
and easily activated while less frequently used controls 
gradually migrate to lower control layers. To further 
enhance the intuitive nature of these controls they can 
also be assigned ‘emotion’. For example, you might 
say, ‘Phone Granny’. The display would show both 
grandmothers’ names and a dot would impatiently 
dither between the two. A straight ‘Granny Fisher’ 
statement then prompts the phone and it dials out. 
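
One way to picture the migration of controls between layers is the following minimal Python sketch. The class `ControlPanel`, its methods and the `top_layer_size` parameter are hypothetical names introduced for illustration. The sketch also simplifies the behaviour: it re-ranks controls immediately by usage count, whereas the migration described above is gradual.

```python
from collections import Counter


class ControlPanel:
    """Soft keys ranked by use; the most used stay on the first layer."""

    def __init__(self, controls, top_layer_size=3):
        self.controls = list(controls)
        self.top_layer_size = top_layer_size  # assumed size of each layer
        self.uses = Counter()

    def activate(self, control):
        """Record one use of a control."""
        if control not in self.controls:
            raise KeyError(control)
        self.uses[control] += 1

    def layers(self):
        """Return the controls grouped into layers, most used first.

        The sort is stable, so controls with equal use keep their
        original relative order.
        """
        ranked = sorted(self.controls, key=lambda c: -self.uses[c])
        n = self.top_layer_size
        return [ranked[i:i + n] for i in range(0, len(ranked), n)]
```

With four controls and a layer size of two, a control that is used often rises to the first layer, while unused controls settle on the second.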


Paper 

The paperless office does not exist, and may seem 
less likely as time goes on, because more technology 
is introduced that demands more copies. This trajectory 
is unsustainable in the long term, and we have to 
reverse the trend and halt the growth in paper 
proliferation. 

The user interface to electronic mail systems can 
be radically improved if our human-oriented user 
interface is applied with a few minor enhancements. 
For example, the scanning of bar coded documents 
allows automatic logging, filing, abstraction and 
tracking. Say the document arrived at 9:15am on 27th 
October. Rob and Phil were with you plus a visitor. 
The text correlator reads the central file copy and the 
key words are ‘Information Exchange’ and ‘publication 
date’. This related information, when automatically 
appended, enables single location filing and retrieval 
via sparse descriptors. This falls precisely in line with 
our abilities. As humans, we can vaguely remember 
the scenario: ‘Rob was with a visitor and it was in the 
morning’. All the documents in this category, complete 
with a video snap of the visitor, can thus be recalled. 
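
Retrieval via sparse descriptors can be sketched as a filter over automatically appended context. The following Python is a minimal illustration, not the system described: `FiledDocument`, `recall` and the field names are hypothetical, and the year in the sample data is arbitrary because the scenario above gives only the day and time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FiledDocument:
    """A document plus the context appended when it was logged."""
    doc_id: str
    arrived: datetime
    people_present: frozenset
    keywords: frozenset


def recall(store, *, person=None, visitor_present=None,
           morning=None, keyword=None):
    """Return documents matching every descriptor that was supplied.

    Descriptors left as None are ignored, so a vague memory such as
    'Rob was with a visitor and it was in the morning' is enough.
    """
    hits = []
    for doc in store:
        if person is not None and person not in doc.people_present:
            continue
        has_visitor = "visitor" in doc.people_present
        if visitor_present is not None and has_visitor != visitor_present:
            continue
        if morning is not None and (doc.arrived.hour < 12) != morning:
            continue
        if keyword is not None and keyword not in doc.keywords:
            continue
        hits.append(doc)
    return hits


# Sample data: the year 1992 is an arbitrary assumption.
store = [
    FiledDocument("doc-1", datetime(1992, 10, 27, 9, 15),
                  frozenset({"Rob", "Phil", "visitor"}),
                  frozenset({"Information Exchange", "publication date"})),
    FiledDocument("doc-2", datetime(1992, 10, 27, 15, 30),
                  frozenset({"Phil"}),
                  frozenset({"budget"})),
]

matches = recall(store, person="Rob", visitor_present=True, morning=True)
```

Here `matches` contains only `doc-1`, the document that arrived in the morning while Rob and a visitor were present.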

As we move to a multimedia environment, the 
ability to add colour, moving images, sound and 
interaction to documents will make paper a less 
powerful medium. Electronic mail will then include 
video sequences, active directories and databases in a form 
that matches your desk’s personal ‘sifter’ and organizer. 


Memory, processing and communication 

These facilities and anticipated amounts of data are 
going to place heavy demands on memory and 
processing power. However, the cost of both has 
dropped by an order of magnitude over the last 5 
years and the trend looks set to continue, so the 
technology should be within the reach of most offices 
within the next decade. 

In order to reduce the memory required, a process 
of (Hebbian) data decay is being investigated. 
Documents are reduced in data content with time as 
their perceived importance diminishes. Thus a 
document with full colour and voice annotation decays 
with time through to a monochrome document with 
low quality audio. Finally it is compressed with only 
contextual and retrieval information easily accessible. 
Regularly used or vitally important documents can 
remain uncompressed and complete. 
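
The decay process can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the exponential half-life, the two thresholds and the names `StoredDocument`, `age`, `touch` and `stage` do not come from the work described. Use reinforces a document and disuse lets it fade, which is the loosely Hebbian element.

```python
from dataclasses import dataclass

# The three stages of decay named in the text, richest first.
STAGES = [
    "full colour with voice annotation",
    "monochrome with low quality audio",
    "compressed, contextual and retrieval information only",
]


@dataclass
class StoredDocument:
    name: str
    importance: float = 1.0  # perceived importance; 1.0 means just used
    pinned: bool = False     # vitally important documents never decay

    def age(self, days, half_life_days=90.0):
        """Let importance fade with disuse (assumed exponential decay)."""
        if not self.pinned:
            self.importance *= 0.5 ** (days / half_life_days)

    def touch(self):
        """Using the document restores its importance."""
        self.importance = 1.0

    def stage(self):
        """Map importance to an index into STAGES (assumed thresholds)."""
        if self.pinned or self.importance >= 0.5:
            return 0
        if self.importance >= 0.1:
            return 1
        return 2
```

A document left untouched moves from stage 0 towards stage 2, while a call to `touch()` or setting `pinned=True` keeps it uncompressed and complete.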


Final remarks 

All of the technology described is either available or 
currently under development. However, very little has 
yet been integrated into a complete system that in any 
way reflects the potential gains possible. Creating an 
environment in which people are able to work 
intuitively, organize information and interactions on a 
human scale should be a prime objective. The proposals 
and work investigations presented here are a first step 
towards that intuitive ‘Office I wish I had’. Our aim is 
to break each of the interface barriers by a human 
orientation of technology to release the joint intellect 
of man and machine. The continued exponential cost 
reduction in electronic data storage, processing power 
and communication bandwidth now makes this a real 
possibility. A decade from now could see it generally 
available in the work place. 


BIBLIOGRAPHY 


1. KRUEGER, M.W. Artificial Reality II, Addison-Wesley, 
1991. 


2. COCHRANE, P. et al. CAMNET, Interlink 2000, 
Aug 1992. 


3. ISHII, H. and KOBAYASHI, M. ClearBoard, CHI ’92 
Conference, May 1992. 


4. COCHRANE, P. and HEATLEY, D.J. Optical Fibre 
Systems and Networks, ibid (2), Feb 1992. 


Aslib Proceedings, vol.45, no.6 


